Abstract. The simulation software for the ATLAS Experiment at the Large Hadron Collider is being used
for large-scale production of events on the LHC Computing Grid. This simulation requires many compo- 
nents, from the generators that simulate particle collisions, through packages simulating the response of 
the various detectors and triggers. All of these components come together under the ATLAS simulation 
infrastructure. In this paper, that infrastructure is discussed, including that supporting the detector de- 
scription, interfacing the event generation, and combining the GEANT4 simulation of the response of the 
individual detectors. Also described are the tools allowing the software validation, performance testing, 
and the validation of the simulated output against known physics processes. 
1 Introduction

ATLAS [1], one of the general-purpose detectors at the
Large Hadron Collider [2], began operation in 2008. The
detector will collect data from proton-proton collisions
with center-of-mass energies up to 14 TeV, as well as
5.5 TeV per nucleon pair in heavy ion (Pb-Pb) collisions.
During proton-proton collisions at the design luminosity
of $10^{34}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$, beam bunches
will cross every 25 ns (40 MHz) and provide on average 23
collisions per bunch crossing. ATLAS has been designed to
record up to 200 bunch crossings per second, keeping only
the most interesting interactions for physics analyses,
including searches for new physics.

In order to study the detector response for a wide range
of physics processes and scenarios, a detailed simulation
has been implemented that carries events from the event
generation through to output in a format which is identical
to that of the true detector. The simulation program is
integrated into the ATLAS software framework, Athena [3],
and uses the Geant4 simulation toolkit [4,5]. The core
software and large-scale production infrastructures are
discussed further in Section 2.

The simulation software chain is generally divided into
three steps, though they may be combined into a single
job: generation of the event and immediate decays (see
Section 3), simulation of the detector and physics
interactions (see Section 5), and digitization of the energy
deposited in the sensitive regions of the detector into
voltages and currents for comparison to the readout of the
ATLAS detector (see Section 6). The output of the simulation
chain can be presented in either an object-based format or
in a format identical to the output of the ATLAS data
acquisition system (DAQ). Thus, both the simulated and real
data from the detector can be run through the same ATLAS
trigger and reconstruction packages.

The ATLAS detector geometry used for simulation,
digitization, and reconstruction is built from databases
containing the information describing the physical
construction and conditions data. The latter contains all the
information needed to emulate a single data-taking run of
the real detector (e.g. detector misalignments or
temperatures). The same geometry and simulation infrastructure
is able to reproduce the test stands and installation
configurations of the ATLAS detector. The detector description
is discussed in Section 4.

Large computing resources are required to accurately
model the complex detector geometry and physics descriptions
in the standard ATLAS detector simulation. This has led to
the development of several varieties of fast simulation.
Each is best suited to a particular use-case, and they are
described in Section 7. Validation of the software, testing
of the software performance, and validation of the physics
performance and output of each piece of the simulation
software chain are discussed in Section 8.

This paper reviews the status of the software and geometry
used for large-scale production in 2008.

2 ATLAS Offline Software Overview

The ATLAS software framework, Athena [3], uses Python as
an object-oriented scripting and interpreter language to
configure and load C++ algorithms and objects.
Rather than develop an entirely new high-energy physics 
data processing infrastructure, ATLAS adopted the Gaudi 
framework [6,7], originally developed for LHCb and written
in C++. Gaudi was created as a flexible framework to
support a variety of applications through base classes and
basic functionality. As much as possible, the infrastructure
relies on the CLHEP common libraries [8], which include
utility classes particularly designed for use in high-energy 
physics software (e.g. vectors and rotations). 

Athena releases are divided into major projects by
functionality [9], and all of the ATLAS simulation software
(including event generation and digitization) resides in a
single project. The dependencies of the "simulation" project
are the "core" project, which includes the Athena framework;
the "conditions" and "detector description" projects, which
include all code necessary for the description of the ATLAS
detector; and the "event" project, which includes
descriptions of persistent objects. The numbers of lines of
code by software language for the simulation project are
summarized in Table 1, as calculated using cloc [10] in
Athena release 14.4. Lines of code in the upstream Athena
projects, excluding external dependencies like Gaudi and
CLHEP, are summarized in Table 2.



Table 1. Numbers of files, lines of code, and lines of com- 
ments in the ATLAS simulation project, by programming lan- 
guage for major contributors. External dependencies are not 
included. 



Language        Files   Comment      Code

C++               930    24,000   120,000
FORTRAN           270    15,000    42,000
C/C++ Header    1,100    13,000    34,000
Python            430    16,000    27,000
HTML               62       130    15,000
Bourne Shell      390     1,000     7,300
C Shell           380       210     3,800
XML                52     1,200     3,400

Sum             3,600    70,000   250,000


All Athena jobs consist of three distinct steps. First, in 
the initialization step, services and algorithms are loaded 
on demand using dynamic library loading. Generally, al- 
gorithms include methods to be called once per event, 
whereas services may be accessed many times during a 
single event. The configuration and initialization is con- 
trolled within a common Python infrastructure which 
allows introspection, particularly useful in debugging and 
providing help for the users. Also, by using a scripting 
language for loading and configuring objects, there is no 
need to recompile C++ code or a script for each job. Small 
Table 2. Numbers of lines of code in each of the projects
upstream of the ATLAS simulation project, by programming
language. Most projects are dominated by C++ and
Python code. The most significant exception is the detector
project, which contains 70,000 lines of XML and Java code.

Project       C/C++ Code   C/C++ Headers   Python Code   Total Code

Core             390,000          43,000       240,000      860,000
Event            200,000         110,000        16,000      350,000
Conditions       280,000          90,000        21,000      620,000
Detector          38,000           6,100         8,400      140,000

Sum              910,000         250,000       280,000    2,000,000



modifications can be made in the scripts (also called "frag- 
ments" or "job options"), or even in the midst of the job, 
without having to stop and recompile the libraries. This 
scripting method also lightens the load on the user, since 
there is, under normal circumstances, no need to com- 
pile anything prior to running a job. Each algorithm and 
service can be configured differently for each step of the 
simulation software chain, allowing maximal sharing of in- 
frastructure among the distinct steps of the chain. Algo- 
rithms can be added to a top list of methods to be run 
during the event loop. 

Second, the event loop begins. All algorithms in the 
top list are run sequentially on each event. An external 
generator or algorithm controlling Geant4 may be added 
to this list, for example. From these main methods, other 
services and algorithms can be called. A messaging ser- 
vice, called throughout the jobs, controls log file outputs 
with different levels of verbosity. The user may configure 
the total logging verbosity or configure the verbosity in- 
dividually for a single algorithm, particularly useful for 
debugging. 
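
The two levels of verbosity control described above can be illustrated with Python's standard `logging` module. This is only an analogy, not the Athena messaging service: the algorithm names `CaloDigi` and `MuonDigi` are hypothetical, and the root logger stands in for the job-wide setting.

```python
import logging

# Attach a handler, then set the job-wide verbosity on the root logger.
logging.basicConfig(format="%(name)-10s %(levelname)-8s %(message)s")
logging.getLogger().setLevel(logging.WARNING)

# One named logger per (hypothetical) algorithm.
calo_log = logging.getLogger("CaloDigi")
muon_log = logging.getLogger("MuonDigi")

# Raise the verbosity of a single algorithm, e.g. while debugging it.
calo_log.setLevel(logging.DEBUG)

calo_log.debug("dumping per-cell energies")    # emitted: CaloDigi is at DEBUG
muon_log.debug("dumping per-hit drift times")  # suppressed: inherits WARNING
muon_log.warning("dead channel found")         # emitted
```

Only the algorithm under study becomes verbose; every other component keeps the quieter job-wide level.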

During the finalization stage of the job, all algorithms 
are terminated and all objects are deleted. At this point, 
algorithms may output any statistics (e.g. memory or CPU 
usage) they track. 

These three steps comprise each Athena job, but the 
infrastructure allows for the insertion of hooks at various 
places. Each step of the ATLAS simulation chain takes 
advantage of this infrastructure to provide maximal flexi- 
bility for the user. Only requested modules are loaded as 
plug-ins, keeping each step as light as possible in memory 
and as fast as possible during the event loop. 
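
The three-step job structure can be sketched in a few lines of Python. The classes and the `run_job` function below are hypothetical stand-ins written for illustration; they mirror the initialize, event-loop, and finalize phases described above but are not the Athena or Gaudi interfaces.

```python
class Algorithm:
    """Hypothetical stand-in for a framework algorithm (not the Athena API)."""

    def __init__(self, name):
        self.name = name
        self.n_events = 0

    def initialize(self):
        pass  # load services, book histograms, and so on

    def execute(self, event):
        self.n_events += 1  # called once per event

    def finalize(self):
        return f"{self.name}: processed {self.n_events} events"


class ScaleEnergy(Algorithm):
    """Toy algorithm configured at run time through its constructor."""

    def __init__(self, name, scale):
        super().__init__(name)
        self.scale = scale

    def execute(self, event):
        super().execute(event)
        event["energy"] *= self.scale


def run_job(top_sequence, events):
    for alg in top_sequence:          # step 1: initialization
        alg.initialize()
    for event in events:              # step 2: event loop
        for alg in top_sequence:      # algorithms run sequentially
            alg.execute(event)
    return [alg.finalize() for alg in top_sequence]  # step 3: finalization


events = [{"energy": 10.0}, {"energy": 20.0}]
report = run_job([Algorithm("Counter"), ScaleEnergy("Scale", scale=0.5)], events)
print(report)
```

Because the sequence is an ordinary list built by the script, a different job is obtained by changing the list or the constructor arguments, with no recompilation.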

For storing data, ATLAS has adopted a scheme for 
separating transient from persistent objects. Most general 
C++ types, immediately prior to storage, are converted
to a type that requires less space. Although, for exam- 
ple, energy is accumulated in the calorimeter by summing 
double-precision floating point numbers, at the end of each 
event and prior to storage, the total energy is converted 
into a single-precision floating point number (float) . Sum- 
ming with floats was found to alter the total energy be- 
cause of truncation. For some types, more complicated 



storage schemes are implemented that rely on properties 
of the information to be stored (e.g. where it is possi- 
ble to sacrifice some accuracy). Metadata, general prop- 
erty information for data collected in a file, are included 
in the output files for all the stages of the event simula- 
tion. The metadata include all configuration information 
for the job. Athena has also adopted the POOL (Pool Of
persistent Objects for LHC) file handling and persistency
framework.
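
The effect of accumulating in single precision can be reproduced with a small experiment. The sketch below is illustrative only: the deposit size and count are arbitrary, and single precision is emulated by rounding through the standard `struct` module rather than by any ATLAS converter.

```python
import struct


def to_float32(x):
    """Round a double-precision Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


deposit = 0.1         # illustrative energy deposit, arbitrary units
n_deposits = 100_000  # illustrative number of deposits in one event

# Transient representation: accumulate in double precision.
total_double = 0.0
for _ in range(n_deposits):
    total_double += deposit

# Persistent representation: convert once, just before storage.
stored = to_float32(total_double)

# For comparison: accumulate in single precision, rounding after every addition.
step = to_float32(deposit)
total_single = 0.0
for _ in range(n_deposits):
    total_single = to_float32(total_single + step)

error_convert_once = abs(stored - total_double)
error_sum_in_single = abs(total_single - total_double)
print(error_convert_once, error_sum_in_single)
```

Converting once at the end costs at most one rounding step, whereas rounding after every addition lets the errors accumulate over the whole event.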
2.1 ATLAS Simulation Overview

An overview of the ATLAS simulation data flow can be
seen in Figure 1. Algorithms and applications to be run
are placed in square-cornered boxes, and persistent data
objects are placed in round-cornered boxes. The optional
steps required for pile-up or event overlay (see Section 6.2)
are shown with a dashed outline.

A generator produces events in standard HepMC format [14].
These events can be filtered at generation time
so that only events with a certain property (e.g. leptonic 
decay or missing energy above a certain value) are kept. 
The generator is responsible for any prompt decays (e.g. Z 
or W bosons) but stores any "stable" particle expected to 
propagate through a part of the detector (see Section [3]) . 
Because it only considers immediate decays, there is no 
need to consider detector geometry during the generation 
step, except in controlling what particles are considered 
stable. During this step, the run number for the simu- 
lated data set and event numbers for each event are es- 
tablished. Event numbers are generally ordered in a single 
job, though events may be omitted because of filtering 
at each step. Run numbers for simulated data sets de- 
rive from the job options used to generate the sample and 
mimic real run numbers used during data taking. 
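
A generator-level filter of this kind can be sketched as follows. The event fields, the run number, and the threshold are hypothetical values chosen for illustration; the point is that surviving events keep their original, ordered event numbers, so gaps appear where events were rejected.

```python
def filter_events(events, keep):
    """Return only the events for which keep(event) is true, in their original order."""
    return [event for event in events if keep(event)]


# Toy generated events; field names and values are hypothetical.
generated = [
    {"run": 105200, "event": 1, "missing_et": 12.0},
    {"run": 105200, "event": 2, "missing_et": 73.5},
    {"run": 105200, "event": 3, "missing_et": 41.0},
    {"run": 105200, "event": 4, "missing_et": 95.2},
]

# Keep events with missing transverse energy above an (assumed) 50 GeV threshold.
selected = filter_events(generated, lambda ev: ev["missing_et"] > 50.0)
kept_numbers = [ev["event"] for ev in selected]
print(kept_numbers)
```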

These generated events are then read into the simulation.
A record of all particles produced by the generator
is retained in the simulation output file (see Section 3.6),
but cuts can be applied to select only certain particles
to process in the simulation. Each particle is propagated 
through the full ATLAS detector by Geant4. The con- 
figuration of the detector, including misalignments and 
distortions, can be set at run time by the user. The ener- 
gies deposited in the sensitive portions of the detector are 
recorded as "hits," containing the total energy deposition, 
position, and time, and are written to a simulation output 
file, called a hit file. 

In both event generation and detector simulation, in- 
formation called "truth" is recorded for each event. In 
the generation jobs, the truth is a history of the interac- 
tions from the generator, including incoming and outgoing 
particles. A record is kept for every particle, whether the 
particle is to be passed through the detector simulation 
or not. In the simulation jobs, truth tracks and decays for 
certain particles are stored. This truth contains, for exam- 
ple, the locations of the conversions of photons within the 
inner detector and the subsequent electron and positron 
tracks. In the digitization jobs, Simulated Data Objects 
(SDOs) are created from the truth. These SDOs are maps 
Fig. 1. The flow of the ATLAS simulation software, from event generators (top left) through reconstruction (top right).
Algorithms are placed in square-cornered boxes and persistent data objects are placed in rounded boxes. The optional pile-up 
portion of the chain, used only when events are overlaid, is dashed. Generators are used to produce data in HepMC format. 
Monte Carlo truth is saved in addition to energy depositions in the detector (hits). This truth is merged into Simulated Data 
Objects (SDOs) during the digitization. Also, during the digitization stage, Read Out Driver (ROD) electronics are simulated. 



from the hits in the sensitive regions of the detector to the 
particles in the simulation truth record that deposited the 
hits' energy. The truth information is further processed in 
the reconstruction jobs and can be used during the analy- 
sis of simulated data to quantify the success of the recon- 
struction software. 
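
The relation between hits and SDOs can be pictured with a small data-structure sketch. The `Hit` fields follow the description above (energy, position, time); the `channel` and `barcode` fields and the `build_sdo_map` function are assumptions introduced here to make the mapping concrete, not the ATLAS classes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    channel: int      # hypothetical identifier of the sensitive element
    energy: float     # total energy deposited (MeV)
    time: float       # time of the deposit (ns)
    position: tuple   # (x, y, z) in mm
    barcode: int      # hypothetical label of the truth particle responsible


def build_sdo_map(hits):
    """Map each channel to the energy contributed by each truth particle."""
    sdo = defaultdict(lambda: defaultdict(float))
    for hit in hits:
        sdo[hit.channel][hit.barcode] += hit.energy
    return {channel: dict(parts) for channel, parts in sdo.items()}


hits = [
    Hit(channel=7, energy=1.5, time=2.1, position=(10.0, 0.0, 5.0), barcode=101),
    Hit(channel=7, energy=2.0, time=2.3, position=(10.1, 0.0, 5.0), barcode=101),
    Hit(channel=7, energy=0.5, time=2.2, position=(10.0, 0.1, 5.0), barcode=102),
    Hit(channel=9, energy=4.0, time=3.0, position=(12.0, 0.0, 5.0), barcode=102),
]
sdo_map = build_sdo_map(hits)
print(sdo_map)
```

With such a map, a reconstructed object built from channel 7 can be traced back to the truth particles that produced its signal.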

The digitization takes hit output from simulated events:
hard scattering signal, minimum bias, beam halo,
beam gas, and cavern background events. Each type of
event can be overlaid at a user-specified rate before the
detector signal (e.g. voltage or time) is generated. The 
overlay (called "pile-up") is done during digitization to 
save the CPU time required by the simulation. At this 
stage, detector noise is added to the event. The first level 
trigger, implemented with hardware on the real detector, 
is also simulated in a "pass" mode. Here no events are 
discarded but each trigger hypothesis is evaluated. The 
digitization first constructs "digits," inputs to the read 
out drivers (RODs) in the detector electronics. The ROD 
functionality is then emulated, and the output is a Raw 
Data Object (RDO) file. The output from the ATLAS de- 
tector itself is in "bytestream" format, which can be fairly 
easily converted to and from RDO file format. The two 
are similar, and in some subdetectors they are almost in- 
terchangeable. Truth information is the major exception. 
It is stripped in the conversion to bytestream. 
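
The "pass" mode of the simulated first-level trigger can be sketched as below. The hypothesis names, thresholds, and event fields are hypothetical; the sketch only shows the logic of evaluating every hypothesis while never discarding an event.

```python
def run_trigger(event, hypotheses, pass_mode=True):
    """Evaluate every trigger hypothesis; in pass mode the event is always kept."""
    decisions = {name: bool(test(event)) for name, test in hypotheses.items()}
    accepted = pass_mode or any(decisions.values())
    return accepted, decisions


# Hypothetical hypotheses with illustrative thresholds (GeV).
hypotheses = {
    "mu20": lambda ev: ev["muon_pt"] > 20.0,
    "j100": lambda ev: ev["jet_et"] > 100.0,
}

soft_event = {"muon_pt": 5.0, "jet_et": 30.0}

kept_in_pass_mode, decisions = run_trigger(soft_event, hypotheses, pass_mode=True)
kept_when_rejecting, _ = run_trigger(soft_event, hypotheses, pass_mode=False)
print(kept_in_pass_mode, kept_when_rejecting, decisions)
```

The decisions are stored with the event, so trigger efficiencies can be studied afterwards without rerunning the digitization.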

The simulation software chain, divided in this way, uses 
resources more effectively than a single-step event simula- 
tion and simplifies software validation. Event generation 
jobs, typically quick and with small output files, can be 
run for several thousands of events at a time. By storing
the output rather than regenerating it each time, it becomes
possible to run identical events through different versions
of the simulation software or with different detector
configurations. The simulation step is particularly slow
and can take several minutes per event (see Section 8.2).
Simulation jobs are therefore divided into groups of 50 or
fewer events; only a few events may be completed in a single
heavy ion simulation job. Digitization jobs are generally
configured to run ~1000 events. This configuration eases
file handling by producing a smaller number of RDO files.
Each step is partially configured based on the input files.
For example, the detector geometry used for a digitization
job is selected based on the input hit file.

The ATLAS high level trigger¹ (HLT) and reconstruction
run on these RDO files. The reconstruction is identical for
the simulation and the data, with the exception that truth
information can be treated and is available only in
simulated data. During data taking, the HLT is run on
bytestream files; however, all hypotheses and additional
test hypotheses may be evaluated by translating the RDOs
into bytestream format.

2.2 Large-Scale Production System 

Because of the significant time consumption of the AT- 
LAS simulation, only minimal jobs can be completed in- 
teractively on most computers. It is, therefore, desirable 



1 The ATLAS high level trigger comprises two stages: level 
2, and the event filter. Both are software triggers run with the 
reconstruction, and may be treated as a single unit for the 
purposes of this discussion. 
to distribute the production of the necessary simulated
data for ATLAS as widely as possible. Complete software
releases are built and distributed to production sites and
users every few weeks, providing all Athena software and
all external dependencies, including generators and Geant4.
These releases are patched several times with "production
caches" before a new clean release is built and distributed.
With each release or production cache, a small set of data
files is packaged that includes database replicas, any
necessary external data files, and some sample output files.
These sample files can be used to validate the locally
installed release by processing events through the entire
software chain, from generation through reconstruction.

Large-scale production is then done on the World-wide
LHC Computing Grid ("WLCG" or "Grid") [17]. A single
task on the Grid (e.g. simulation of 500,000 $t\bar{t}$ events)
is separated into many jobs depending on the content 
and complexity of the task. A job can be completed by 
a single CPU within the maximum allowed time for a job 
on the Grid (typically 2-3 days). The output, including 
log files, of every Grid job is registered with the ATLAS 
Distributed Data Management system (DDM) [18]. The 
DDM uses DQ2 [19] for dataset bookkeeping, and allows 
users to search for datasets on the Grid, analyze them in 
place, and, if necessary, retrieve them. Separate Grid soft- 
ware controls the distribution of jobs to the various Grid 
sites. In a typical task, 10 jobs are queued and run as a 
test sample, and only once they finish successfully are the 
remainder of the jobs released to the Grid. In the case of 
a full chain of jobs being run (generation, simulation, and 
digitization), each subsequent step is automatically held 
in the queue until the required data are available from 
the previous step. Frequently, Grid jobs are configured to 
run two steps (e.g. simulation and digitization together, 
or digitization and reconstruction together). About one 
million events per day can be produced using Geant4 on 
the Grid. 
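
The splitting of a task into jobs and the release of the remainder only after a successful test sample can be sketched as follows. The function names and the `run_job` callback are hypothetical; the numbers (500,000 events, 50 events per simulation job, 10 test jobs) are taken from the text.

```python
def split_task(n_events, events_per_job):
    """Return (first_event_index, n_events_in_job) pairs covering the whole task."""
    return [
        (first, min(events_per_job, n_events - first))
        for first in range(0, n_events, events_per_job)
    ]


def release_remainder(jobs, run_job, n_pilot=10):
    """Run the first n_pilot jobs; return the remaining jobs only if all succeed."""
    pilot, remainder = jobs[:n_pilot], jobs[n_pilot:]
    if all(run_job(job) for job in pilot):
        return remainder
    return []  # the rest of the task stays held back


jobs = split_task(500_000, 50)
released = release_remainder(jobs, run_job=lambda job: True)
held = release_remainder(jobs, run_job=lambda job: False)
print(len(jobs), len(released), len(held))
```

A failing test sample therefore costs at most ten jobs rather than the full task.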

On the Grid, "job transforms" are run, which may only 
include well-defined, minor modifications to some stan- 
dard job configuration after the input events have been 
specified. A task is given a random number seed, and each 
job increments the seed in sequence. The modifications to 
a generation job also may include a configuration file for 
the selected generator to be run. These configuration files 
are included with each release and may not be arbitrarily 
modified by the user during production. The modification 
to a simulation job may include detector geometry and 
conditions and specially designed job options fragments 
that are included with each release. These fragments are 
typically constructed for a very specific purpose, for ex- 
ample a non-standard vertex smearing, simulation of cav- 
ern background, or propagation and late decay of long- 
lived exotic particles. Many of these modifications can be 
chained to provide maximal flexibility to the user, but 
if two fragments are sufficiently complex such chaining 
becomes impossible. The modifications to a digitization 
job may include geometry and conditions versions, calori- 
meter sampling fraction, trigger configuration, and noise 



control. These modifications are discussed further in the 
subsequent sections. 
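
The seeding scheme described above, one seed per task incremented for each job, can be sketched with Python's standard `random` module. The function names and the toy "generator" are hypothetical; the sketch only illustrates that a job is reproducible from its seed and that distinct jobs draw distinct random sequences.

```python
import random


def job_seeds(task_seed, n_jobs):
    """Each job in a task uses the task seed incremented by its position."""
    return [task_seed + index for index in range(n_jobs)]


def toy_generate(seed, n_events=3):
    """Toy stand-in for a generation job: a few pseudo-random numbers per job."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_events)]


seeds = job_seeds(task_seed=1000, n_jobs=3)
first_run = toy_generate(seeds[0])
rerun = toy_generate(seeds[0])
other_job = toy_generate(seeds[1])
print(seeds)
```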



3 Event Generation Overview 

Event generation consists of the production of a set of par- 
ticles which is passed to either full or fast detector simula- 
tion. Event generation runs within the Athena framework, 
but most of the generators themselves are written and 
maintained by authors external to ATLAS. The ATLAS- 
specific implementation, therefore, consists mostly of a set 
of interface packages. These are designed to be as sim- 
ple as practicable and wherever possible to be factorized 
from the external packages. This is essential to allow rapid 
feedback and bug reporting to the authors of the exter- 
nal packages. Most of the well-understood and thoroughly 
debugged generators are written in FORTRAN. Their in- 
terfaces transfer the event information, mostly contained 
in FORTRAN common blocks, into an object format that 
can be used by the ATLAS software. This ensures that any 
downstream algorithms are shielded from details specific 
to an individual generator. Events can either be stored as 
POOL files for later use or passed to simulation in the 
same Athena job. 

Details of the framework and comments specific to 
each generator are listed below. Large-scale production 
has been run with Pythia [20] (including an ATLAS variant,
PythiaB [21,22], used for production of events with
B-hadrons), Herwig [23-25], Sherpa [26], Hijing [27],
Alpgen [28], MC@NLO [29], and AcerMC [30]. Tauola [31] and
Photos [32] are routinely used to handle tau decays and
photon emission. EvtGen [33] is used for B-decays in cases
where the physics is sensitive to details of the B hadron
decays.² ISAJET [34] is used for generating supersymmetric
particles in conjunction with Herwig. The newer C++
generators Pythia 8 [35] and Herwig++ [36] are being tested.
Both produce events in the HepMC format, so no translation
is needed. They can be passed directly to simulation. As
these new generators evolve and undergo extensive testing
and validation, they are expected to enter production
shortly and eventually supersede their FORTRAN
predecessors. Some production was also done with
MadGraph [37] (vector boson scattering), CHARYBDIS [38]
(black hole event generation), and CompHEP [39,40] (specific
exotic physics models). Discussion of the generation of
cavern background, beam halo, and beam gas events follows
in Sections 6.2.1 and 6.2.2.
Single particle generators are also used to generate cos- 
mic ray events and single particle events for performance 
studies and calibration of the detector. 

Each generated event contains the particles from a single
interaction with a vertex located at the geometric origin.
Modifications to account for the beam properties are
applied to the event before it is passed to Geant4 (see
Section 5.1). Particles with a proper lifetime $c\tau > 10$ mm



2 Pythia remains the default for current inclusive produc- 
tion, but EvtGen is likely to be used by default for the long- 
term production. 
are considered stable by the event generator. They can
propagate far enough to interact with detector material
before decaying. Their decays are handled by the simulation.
Any particles with $c\tau < 10$ mm are decayed by the
event generator, and their interactions with material or
curving in the magnetic field of ATLAS are ignored.
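
The 10 mm threshold can be illustrated with a short sketch. The decay lengths below are approximate, rounded values quoted from memory for illustration only, and the particle labels and function name are hypothetical.

```python
STABLE_CTAU_MM = 10.0  # threshold quoted in the text

# Approximate proper decay lengths c*tau in mm (rounded, for illustration only).
CTAU_MM = {
    "pi+": 7.8e3,
    "Lambda": 78.9,
    "K0_S": 26.8,
    "B+": 0.49,
    "tau-": 0.087,
    "pi0": 2.5e-5,
}


def decayed_by(name):
    """Say which step of the chain is responsible for decaying the particle."""
    return "simulation" if CTAU_MM[name] > STABLE_CTAU_MM else "generator"


for name in CTAU_MM:
    print(f"{name:7s} c*tau = {CTAU_MM[name]:10.3g} mm -> {decayed_by(name)}")
```

Under these values a K0_S or Lambda is handed to the simulation undecayed, while B hadrons and tau leptons are decayed at the generator level.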



3.1 Generator Framework 

Many external generator packages assume that the pa- 
rameters for a particular job are set via a main program. 
This would require recompilation to change parameters. 
The Athena generator interfaces allow for the passing of 
all relevant parameters at run time, permitting a fixed 
software release to be used to produce different physics 
configurations. During initialization, the relevant param- 
eters are passed via Python fragments. The combination 
of the fragments, random number seeds, and the software 
release uniquely identifies the resulting data.³ The Athena
event manager is run for each event, and a run number
and an event number are created; then the event gener- 
ator is asked to produce an event. This event is created 
in memory in the format specific to the generator itself. 
The event must then be mapped into a common format so 
that subsequent algorithms are independent of the gener- 
ator used. 



ATLAS uses the HepMC event record [14], initially
developed by the ATLAS collaboration but now supported
by WLCG [17]. This is a set of C++ classes which holds
the full event as produced by the generator. Stable
particles are used as input to simulation; unstable ones can
be of use in physics studies and diagnostics. Each event 
generator produces a very large number of stable particles
(e.g. muons, kaons, pions, electrons, photons), a much
larger number of unstable particles (e.g. gluons, quarks, B 
mesons, heavy hyperons), and, possibly, other objects (e.g. 
"strings" or "color singlet clusters" ) specific to an individ- 
ual generator. The HepMC record consists of a connected 
tree, navigation inside of which retains information on the 
event history including the parents of unstable particles. 
There is an important caveat here: the event generators 
are modeling quantum processes, and the event record has 
the structure of a classical decay chain. It is inevitable 
that compromises must be made and difficulties can arise 
from an over-literal interpretation of the tree structure. A 
very simple example is provided by events containing an
$e^+e^-$ pair. The parent of the $e^+e^-$ pair cannot be
uniquely specified, as the pair may arise from an
intermediate Z boson, photon, or quantum interference. The
HepMC event record is also used to contain the particle
information from secondaries produced by interactions in the
detector. This is discussed below in the section on Monte
Carlo truth (see Section 3.6). Information about all
interacting partons (e.g. momentum fractions $x_1$ and
$x_2$) is saved, so that parton distribution function
reweighting can be done without rerunning the event
generation.
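
As a sketch of why the stored momentum fractions are sufficient: in the usual leading-order picture, an event generated with one set of parton distribution functions can be given a weight corresponding to another set by taking the ratio of parton densities. The notation below is illustrative and is not taken from this paper; it assumes that the parton flavours $a$, $b$ and the scale $Q$ are available from the event record alongside $x_1$ and $x_2$.

```latex
w \;=\;
\frac{f^{\mathrm{new}}_{a}(x_1, Q^2)\; f^{\mathrm{new}}_{b}(x_2, Q^2)}
     {f^{\mathrm{old}}_{a}(x_1, Q^2)\; f^{\mathrm{old}}_{b}(x_2, Q^2)}
```

Here $f^{\mathrm{old}}$ is the parton density used at generation time and $f^{\mathrm{new}}$ the density being studied; applying $w$ event by event avoids regenerating the sample.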



The FORTRAN generators usually use the HEPEVT 
common block [41] to store the information. Unfortunate- 
ly, the different generators use slightly different structures. 
A separate translation into HepMC is needed for each one. 
The C++ generators such as HERWIG++ produce output 
in the HepMC format. No translation is required and the 
integrity of the HepMC event record is the responsibility 
of the generator authors. 
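
The translation from a HEPEVT-style record into a connected tree can be sketched as follows. The array names (`isthep`, `idhep`, `jmohep`, `phep`) follow the HEPEVT common block, with status code 1 marking a stable final-state particle and mother indices starting at 1; the `Particle` class and `from_hepevt` function are hypothetical and much simpler than the HepMC classes.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Particle:
    pdg_id: int
    status: int       # HEPEVT convention: 1 = stable final state
    momentum: tuple   # (px, py, pz, E, m) in GeV
    parents: list = field(default_factory=list, repr=False)
    children: list = field(default_factory=list, repr=False)


def from_hepevt(isthep, idhep, jmohep, phep):
    """Build linked Particle objects from HEPEVT-style parallel arrays."""
    particles = [
        Particle(pdg_id, status, tuple(p))
        for pdg_id, status, p in zip(idhep, isthep, phep)
    ]
    for child, mothers in zip(particles, jmohep):
        for index in dict.fromkeys(mothers):   # ignore a repeated mother index
            if index > 0:                      # 0 means "no mother"
                parent = particles[index - 1]  # FORTRAN indices start at 1
                child.parents.append(parent)
                parent.children.append(child)
    return particles


# Toy record: a Z boson at rest decaying to an electron-positron pair.
isthep = [2, 1, 1]
idhep = [23, 11, -11]
jmohep = [(0, 0), (1, 1), (1, 1)]
phep = [
    (0.0, 0.0, 0.0, 91.19, 91.19),
    (0.0, 0.0, 45.595, 45.595, 0.000511),
    (0.0, 0.0, -45.595, 45.595, 0.000511),
]

record = from_hepevt(isthep, idhep, jmohep, phep)
stable = [p.pdg_id for p in record if p.status == 1]
print(stable)
```

Only the status-1 particles would be handed to the simulation; the Z boson remains in the record for physics studies, subject to the caveat about over-literal readings of the tree.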



3.2 General Purpose Generators 

General purpose generators produce complete events start- 
ing from a proton-proton, proton-nucleus or nucleus-nu- 
cleus initial state. They are used standalone or with spe- 
cialized generators that improve the description of certain 
final states. They have many parameters, some of which 
are related to fundamental parameters such as the QCD 
coupling constant and electroweak parameters, and some 
of which describe the models used to parametrize long 
distance QCD, soft QCD, and electroweak processes. 



3.2.1 Pythia and PythiaB 



Pythia [20] and Herwig (see below) in their FORTRAN
versions have been tested, used, and validated over many
years in $e^+e^-$ and hadron colliders. They start with a
hard scattering process calculated to lowest order in QCD.
They then add additional QCD and QED radiation in a
shower approximation which is most accurate when the
radiation is emitted at small angle. The approximation is
poorest in those cases with a large number of widely
separated emissions of comparable energy. In addition,
Pythia uses a model for hard and soft scattering processes
in a single event in order to simulate underlying activity.
This model is used in the simulation of minimum bias 
events. While other generators may be used for specific 
final states, Pythia and Herwig are the benchmarks. 

ATLAS uses Pythia 6.4. There are two models of
QCD radiation in Pythia. By default, ATLAS uses the
showering model introduced in Pythia 6.3. This showering
model is believed to better match the theoretical
description of QCD showers. It produces somewhat more
jet activity [42,43], resulting in "busier" events than the
older model which was used, for example, for detailed
simulations at the Tevatron (see, for example, [44,45]). In
this model, the multiple scatters which make up the
underlying event are interleaved with the parton shower
according to the hard scale of the scatter or the emission.
At the end of the shower, a phenomenological model is used
to combine the quarks and gluons into hadrons. This
hadronization model, which has many parameters, has been
tuned by comparison with data in $e^+e^-$, $ep$, and hadron
colliders [46,47]. The underlying event model was retuned
within ATLAS [48] to recover an acceptable description of
the Tevatron data [49,50]. Pythia contains a very large



3 Since pseudo-random number generators are chip architec-
ture dependent, jobs are exactly reproducible only when run
on the same type of processor (e.g. Intel or AMD).



number of built-in processes, and new ones can be added 
by modifying the code. Hard scattering events can also be 
generated in a separate program in a standard format and 
fed into Pythia for the addition of a parton shower and
hadronization. Pythia is the default generator in ATLAS:
many hundreds of millions of events have been generated
using it. Its ease of use, speed, and robustness make it an
ideal choice for the default. It is supplemented by other
generators, either to obtain some estimate of the
uncertainties, or when specialized generators are expected
to give a better physical description in certain final
states.

PythiaB [21,22] is an ATLAS-specific modification of
Pythia aimed at the efficient generation of events related
to B-physics. In Pythia, most high-$p_T$ bottom quarks
are produced in the QCD shower of a high-$p_T$ light quark
or gluon from a hard scattering process. Most showers do
not produce such a $b\bar{b}$ pair, so using Pythia to
generate B-physics events is inefficient. PythiaB reuses
those QCD showers that contain a b- or c-quark, hadronizing
them several times to increase the probability of producing
a b-hadron. Since the probability of producing a b- or
c-quark in a parton shower is low, this procedure produces
b-hadron events more efficiently without introducing any
bias in the distribution of b-hadrons within the event. If
a specific decay mode is then required, the b-hadron decay
can be forced using a modified b-hadron decay table, either
in Pythia itself or via EvtGen.
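
The repeated-hadronization idea can be sketched schematically. Everything below is a toy: the callbacks `make_shower`, `hadronize`, and `has_b_hadron` are hypothetical placeholders for the corresponding generator steps, partons are represented only by their PDG flavour codes (4 for charm, 5 for bottom), and no claim is made about how PythiaB is actually implemented.

```python
HEAVY_FLAVOURS = {4, 5}  # PDG codes for the charm and bottom quarks


def generate_b_events(make_shower, hadronize, has_b_hadron, n_showers, n_repeat):
    """Hadronize each shower containing a b- or c-quark n_repeat times."""
    kept = []
    for _ in range(n_showers):
        shower = make_shower()
        if not any(abs(flavour) in HEAVY_FLAVOURS for flavour in shower):
            continue  # no heavy quark: skip the hadronization entirely
        for _ in range(n_repeat):
            event = hadronize(shower)
            if has_b_hadron(event):
                kept.append(event)
    return kept


# Toy inputs: three showers, only the second contains a b-quark pair.
showers = iter([[1, -1], [5, -5], [21, 2]])
hadronize_calls = []


def toy_hadronize(shower):
    hadronize_calls.append(shower)
    return {"partons": shower, "attempt": len(hadronize_calls)}


b_events = generate_b_events(
    make_shower=lambda: next(showers),
    hadronize=toy_hadronize,
    has_b_hadron=lambda event: True,  # toy: every attempt yields a b-hadron
    n_showers=3,
    n_repeat=4,
)
print(len(hadronize_calls), len(b_events))
```

The costly shower step is run once per candidate, while the comparatively cheap hadronization is repeated only for showers that can yield a b-hadron.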



3.2.2 Herwig

ATLAS uses Herwig 6.5 [23-25], the last release of the
FORTRAN Herwig package, which is now superseded by
Herwig++ (see below). It is a flexible generator with a
large number of built-in processes and has been tuned to
agree with the Tevatron data [49,50]. In particular, most
of the generation of supersymmetric processes is done with
Herwig using the ISAWIG package [23-25] with the particle
spectra and decay modes generated by ISAJET. ATLAS uses
Herwig with the Jimmy [51] implementation of the
underlying event.

3.2.3 Sherpa

Sherpa [26] is a generator written in C++ which implements
the CKKW duplicate removal prescription [52] to
match fixed-order QCD matrix elements to QCD showers.
It uses an interface to Pythia's hadronization model and
produces complete events. It is expected to give better
approximations for final states with large numbers of
isolated jets than generators such as Pythia and Herwig
based on pure QCD showering. Sherpa generates underlying
events using a simple multi-parton interaction model
based on that of Pythia. For each new process to be
generated, Sherpa must be recompiled to incorporate the
specific libraries for the process of interest. On the Grid,
this implies either recompiling Sherpa at the production
site or deploying updated libraries for new production jobs.
Instead, Sherpa is run locally to produce event files in
Sherpa's native format. These files are then translated into
the HepMC format with an additional Athena Grid job.
It is also possible to run Sherpa entirely within an Athena
job.

3.2.4 Hijing

Hijing [27] is a dedicated generator for the production of
heavy ion events at all impact parameters. In a dense
nuclear environment, such as appears in central collisions,
a particle produced in a primary collision can re-interact
several times as it propagates. Hijing models the
propagation. It is also the only generator that can be used
for proton-nucleus collisions occurring in beam-gas
interactions. Hijing uses the Pythia hadronization model.



3.2.5 Single Particle Generators 

A single particle event generator is frequently used for 
calibrating the detector, testing, and evaluating the re- 
construction efficiencies. Although unphysical, these gen- 
erators produce events with a single primary particle, for 
example a muon, electron, or charged pion, at a speci- 
fied energy, position, and momentum direction. A range 
may also be specified for either the energy or direction. 
No underlying event, proton remnants, or other primary 
interactions are included when these events are generated. 
A specialized single particle generator is used to pro- 
duce cosmic ray events. Single muons are generated at 
the earth's surface in a square region (typically 600 m 
by 600 m) above the ATLAS detector and with the standard
cosmic ray $p_T$ spectrum [53,54]. The upper and lower
energy cutoffs of the spectrum are configurable. Those 
muons pointing to a sphere of configurable size (typically 
20 m) centered at the geometric origin are propagated 
through the bedrock and the ATLAS cavern during sim- 
ulation. 
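The pointing requirement for cosmic ray muons can be sketched as a simple geometric test. This is a minimal illustration and not the ATLAS generator: the coordinate convention (y up, origin at the detector center), the surface height `SURFACE_HEIGHT_M`, the flat angular sampling, and all function names are assumptions made for the example. Only the 600 m square and the 20 m sphere come from the text.

```python
import math
import random

HALF_SIDE_M = 300.0      # half of the 600 m x 600 m surface square (from the text)
SPHERE_RADIUS_M = 20.0   # typical pointing-sphere radius (from the text)
SURFACE_HEIGHT_M = 90.0  # ASSUMED height of the surface above the origin


def points_to_sphere(start, direction, radius=SPHERE_RADIUS_M):
    """True if a straight track from `start` along `direction` passes
    within `radius` of the origin while moving forward."""
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    # Path length from the start point to the point of closest approach.
    t_closest = -sum(s * u for s, u in zip(start, unit))
    if t_closest <= 0.0:
        return False  # the track is moving away from the origin
    closest = [s + t_closest * u for s, u in zip(start, unit)]
    return math.sqrt(sum(c * c for c in closest)) <= radius


def sample_surface_muon(rng):
    """Sample a start point in the square and a downward-going direction."""
    x = rng.uniform(-HALF_SIDE_M, HALF_SIDE_M)
    z = rng.uniform(-HALF_SIDE_M, HALF_SIDE_M)
    cos_zenith = rng.uniform(0.1, 1.0)  # toy angular distribution (assumption)
    sin_zenith = math.sqrt(1.0 - cos_zenith ** 2)
    azimuth = rng.uniform(0.0, 2.0 * math.pi)
    direction = (sin_zenith * math.cos(azimuth), -cos_zenith,
                 sin_zenith * math.sin(azimuth))
    return (x, SURFACE_HEIGHT_M, z), direction


def select_pointing_muons(n_trials, seed=1):
    """Generate `n_trials` surface muons and keep those pointing at the sphere."""
    rng = random.Random(seed)
    muons = (sample_surface_muon(rng) for _ in range(n_trials))
    return [m for m in muons if points_to_sphere(*m)]
```

Only the muons returned by `select_pointing_muons` would be handed on for propagation, which avoids simulating tracks that cannot reach the detector.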



3.3 Specialized Generators 

Specialized generators do not produce complete events 
which can be passed directly to simulation. Rather, they 
are run in conjunction with one of the general purpose 
generators to improve the accuracy for specific decays or 
specific final states. Several of these specialized genera- 
tors are "Les Houches" type generators. That is, they are 
run standalone using unmodified code from the genera- 
tor author and produce an ASCII file containing partonic 
four-vectors in the "Les Houches" format [55,56]. Athena
uses a common interface that reads in these files and
prepares them for processing in Pythia or Herwig [55].
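To make the file-based hand-off concrete, the sketch below reads one event block laid out as in the Les Houches event file convention: a header line (NUP, IDPRUP, XWGTUP, SCALUP, AQEDUP, AQCDUP) followed by one line per particle (IDUP, ISTUP, two mother indices, two colour tags, px, py, pz, E, M, lifetime, spin). It is not the Athena interface. The `Parton` class, the function name, and the sample event are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Parton:
    pdg_id: int
    status: int   # -1 = incoming, 1 = outgoing final state
    px: float
    py: float
    pz: float
    energy: float
    mass: float


def parse_event_block(text):
    """Parse the body of one event block into (weight, list of partons)."""
    rows = [ln.split() for ln in text.strip().splitlines()
            if ln.strip() and not ln.lstrip().startswith("#")]
    header, body = rows[0], rows[1:]
    n_particles = int(header[0])
    weight = float(header[2])
    partons = [
        # Columns 6..10 hold px, py, pz, E, M.
        Parton(int(cols[0]), int(cols[1]), *map(float, cols[6:11]))
        for cols in body[:n_particles]
    ]
    return weight, partons


# Invented sample: u ubar -> e- e+ at 100 GeV in the center-of-mass frame.
SAMPLE_EVENT = """
 4  1  1.0  100.0  0.0078  0.118
  2 -1  0  0  501    0    0.0  0.0   50.0  50.0  0.0  0.0  9.0
 -2 -1  0  0    0  501    0.0  0.0  -50.0  50.0  0.0  0.0  9.0
 11  1  1  2    0    0   30.0  0.0   40.0  50.0  0.0  0.0  9.0
-11  1  1  2    0    0  -30.0  0.0  -40.0  50.0  0.0  0.0  9.0
"""

weight, partons = parse_event_block(SAMPLE_EVENT)
outgoing = [p for p in partons if p.status == 1]
```

The partonic four-vectors recovered this way are what a shower generator such as Pythia or Herwig would take as its starting point.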



3.3.1 ISAJET 



The FORTRAN generator ISAJET [34] is not used in 
large-scale production. However, it is used in conjunction 
with Herwig for generation of supersymmetric events. 
Here, the ISASUGRA component of ISAJET is used to 
generate consistent sets of masses and decay modes for 
supersymmetric models. These are then loaded into Her- 
wig using the ISAWIG translation package, and Herwig 
then generates complete final states.



3.3.2 Photos and Tauola

ATLAS uses the dedicated tau decay package Tauola to
handle tau decays [31]. General purpose generators are
set to treat tau leptons as stable. The events are passed to 
Tauola for decay. Because Tauola is a FORTRAN package, 
the events are extracted from the HEPEVT record. The 
Tauola interface is dependent on the generator that pro- 
duced the tau, because helicities and helicity correlations 
are passed in generator-dependent formats. The original 
generator's results must be replaced, so both the input and 
output formats of Tauola are in fact generator-dependent. 
Special attention is paid to the polarization of the tau. 
In certain cases, for example the decay $W^\pm \to \tau^\pm \nu_\tau$,
the polarization is known for the tau. In others, such as
$Z \to \tau^+ \tau^-$, there is a correlation between the polarization
of the taus.

Photos handles electromagnetic radiation [32]. It is
used by Tauola, and, therefore, Tauola cannot be used
without Photos. Photos is also used to improve the description
of electromagnetic radiation in, for example, the
decay $W \to e \nu_e$, where radiation distorts the electron
energy distribution. In these cases the final state electro- 
magnetic radiation is switched off in the general purpose 
generator, usually Herwig or Pythia, to avoid double 
counting. 



3.3.3 EvtGen

EvtGen [33], originally developed by the CLEO collaboration,
provides a more complete description of B meson and
hadron decays than that provided by defaults in Pythia
or Herwig. Recent modifications have been made to
handle $B_s$ and b-baryon decays, incorporating measurements
from the Tevatron, BaBar, and Belle. In particular,
EvtGen incorporates the best measurements of branching
ratios and has theoretical models for unmeasured decay
modes. It includes angular correlations, which impact
the acceptance for certain decay modes of B mesons and
baryons. It has been used for ATLAS studies involving the
prospects for measurements of exclusive B decays.



3.3.4 Alpgen

Alpgen [28] is a "Les Houches" type generator enabling
more sophisticated generation of certain final states. Herwig
or Pythia is then used to perform the hadronization
and produce final (and initial) state QCD radiation. Alpgen
is targeted at final states with several well-separated
hadronic jets where the fixed order QCD matrix element is
expected to give a better approximation than the shower
approximation of Pythia or Herwig. Alpgen is used, for
example, to generate final states containing a W or Z
and many jets. Alpgen also provides an algorithm to prevent
double counting by event rejection. The Athena interface
package includes the methods needed to pass events
through Herwig or Pythia and veto those events that
would contribute to double counting. This process can be
very inefficient for final states with large numbers of jets,
and generation time can be significant.



3.3.5 MC@NLO

MC@NLO [29], which is also a "Les Houches" type generator,
runs standalone to produce ASCII files which are
then processed by Herwig running inside of Athena.
MC@NLO uses fundamental (hard scattering) processes
evaluated at next-to-leading order in QCD perturbation
theory. It is used, for example, to generate top events as it
gives a better representation of the transverse momentum
($p_T$) distribution of top quarks than Pythia or Herwig.
MC@NLO includes one-loop corrections, with the
consequence that events appear with negative and positive
weight which must be taken into account when they are
used. Any resulting distribution will contain entries from
both types of event, and, given sufficient statistics, the
result will be physical (i.e. positive). MC@NLO has been
used for large-scale production of top, W and Z events.
Only the parts of MC@NLO needed to read these events
and process them via Herwig are included in Athena
releases.



3.3.6 AcerMC

AcerMC [30] is a "Les Houches" type generator aimed
primarily at the production of W or Z bosons with several
jets, including jets originating from b-quarks. A partonic
final state is obtained by running it standalone and making
an external ASCII file. Only the parts needed to read
these events and process them via Pythia are included in
Athena releases.
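The signed event weights described for MC@NLO in Section 3.3.5 affect how any distribution is filled. The sketch below is a toy illustration only: the function name, the bin edges, and the sample values are invented, and the weights are simply +1 or -1. Each bin holds the sum of weights, and its statistical uncertainty is the square root of the sum of squared weights.

```python
import math


def fill_weighted(values, weights, edges):
    """Return (sum of weights, sqrt(sum of squared weights)) for each bin."""
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    sums_sq = [0.0] * n_bins
    for value, weight in zip(values, weights):
        for i in range(n_bins):
            if edges[i] <= value < edges[i + 1]:
                sums[i] += weight
                sums_sq[i] += weight * weight
                break
    errors = [math.sqrt(s) for s in sums_sq]
    return sums, errors


# Invented toy sample: transverse momenta in GeV with signed unit weights.
pt_values = [12.0, 18.0, 25.0, 27.0, 33.0, 15.0]
weights = [1.0, 1.0, 1.0, -1.0, 1.0, -1.0]
sums, errors = fill_weighted(pt_values, weights, edges=[10.0, 20.0, 30.0, 40.0])
```

With so few entries a bin can cancel to zero or even go negative, which is why the text requires sufficient statistics before the result is physical. Note also that the uncertainty is set by the total number of events in a bin, not by the net sum of weights.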



3.4 New C++ Generators 

3.4.1 Pythia 8 

Pythia 8 [35] is a rewrite of the FORTRAN Pythia in
C++ with new and expanded physics models. It provides
a new user interface, transverse-momentum-ordered show- 
ers, and interleaving with multiple interactions. The pro- 
gram is under intensive tests and it will require some fur- 
ther tunings before it can replace the Pythia6 code as a 
leading generator. It is, however, interfaced to Athena and 
used for generator studies in ATLAS. It includes support 
for both "Les Houches" and HepMC event formats. 



4 An alternative tool, POWHEG [57], implements essentially
the same physics and produces events with only positive
weight. Once it includes all the processes that MC@NLO
does and has been validated, it is expected to take the place
of MC@NLO.



3.4.2 Herwig++

Herwig++ [36] is the C++ based replacement for Herwig.
It contains only important processes from the Standard
Model, the universal extra dimensions model, and
supersymmetric models (whose details are specified via
Supersymmetric Les Houches Accord model files [58,59]).
Additional hard scattering processes can be used via "Les
Houches" input from specialized generators, and additional
decay models can be added by users.

Herwig++ will soon be used for generation of some
Standard Model processes, notably W and Z production.
It will also be used for supersymmetric processes, because
it includes full spin correlations and QCD radiation in
the supersymmetric decay chains. The current version of
Herwig++ also incorporates an underlying event model
based on the extension of Jimmy [51] to include soft scatters
[60] and can thus potentially generate minimum bias
physics.



3.5 Parton Distribution Functions 

Parton distribution functions (PDFs) are used to describe 
the substructure of the proton and are used by all the 
event generators as external inputs. ATLAS uses the Les
Houches Accord PDF Interface (LHAPDF [61]) library,
which replaces PDFLIB [62] and provides a large
repository of PDFs. CTEQ [63] PDFs are used by default
(MC@NLO uses NLO PDFs, and all other generators
use LO PDFs). There is a correlation between the PDFs
and the tuning of parameters connected to initial state
radiation [64,65]: inconsistent results can be obtained by
varying the PDFs in isolation. Therefore, when a new set
of PDFs is used, the parameters of the event generator are
retuned to produce consistent results [42].



3.6 Monte Carlo Truth 

The entire connected tree of the HepMC event record is 
stored as the Monte Carlo truth. Only the stable parti- 
cles are propagated by the simulation. The various sta- 
tus codes and event history provided by the individual 
generators are retained within the HepMC event record. 
Unfortunately, much of this information is specific to a 
particular generator. Only status codes 1 (stable) and 2 
(unstable) have a general meaning: the remaining values 
are used differently by the individual generators. As re- 
marked in Section |3.1| there can be ambiguities resulting 
from the attempt to represent a quantum process by a 
classical tree. Some filters have been provided to select 
HepMC particles that, for example, are stable at the gen- 
erator level or are non-interacting (e.g. neutrinos). 
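The kind of truth filter mentioned above can be sketched as follows. This is a minimal illustration and not the HepMC or Athena API: `TruthParticle`, the predicate names, and the sample event are assumptions made for the example. The only facts taken as given are the general meaning of status codes 1 and 2 and the standard PDG identifiers of the neutrinos (12, 14, 16).

```python
from dataclasses import dataclass

NEUTRINO_IDS = {12, 14, 16}  # PDG codes of nu_e, nu_mu, nu_tau


@dataclass
class TruthParticle:
    pdg_id: int
    status: int  # 1 = stable, 2 = unstable; other values are generator-specific


def is_generator_stable(particle):
    return particle.status == 1


def is_non_interacting(particle):
    return abs(particle.pdg_id) in NEUTRINO_IDS


def select(particles, *predicates):
    """Keep the particles that satisfy every predicate."""
    return [p for p in particles if all(pred(p) for pred in predicates)]


# Invented sample record: W- -> mu- anti-nu_mu plus a pi0 and a photon.
event = [
    TruthParticle(-24, 2),
    TruthParticle(13, 1),
    TruthParticle(-14, 1),
    TruthParticle(111, 2),
    TruthParticle(22, 1),
]
stable = select(event, is_generator_stable)
invisible = select(event, is_generator_stable, is_non_interacting)
```

Because only codes 1 and 2 are portable, a filter written this way behaves the same for every generator, whereas a filter on any other status value would not.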

When the simulation is run, the HepMC tree from
the event generator is copied, and some particles resulting
from decays within, or interactions simulated by, Geant4
are added to the copy (see Section 5.3). In this way, a complete
event including both the generator and simulation
information is provided. In order to ensure consistency,
a particle decayed by Geant4 but considered stable by
the generator (such as a $K^0_S$) has its status code changed
when the copy is made. A particle that has status code
2 after simulation will be identified as stable at the generator
level, if the decay took place in Geant4. Geant4
secondaries are distinguished from those from the generators
by an offset applied to their numerical identifier. The
resulting Monte Carlo truth record can be large and account
for a significant fraction (~30%) of the disk space
used by a simulated event after reconstruction.



3.7 Default Parameters, Tuning and Bug Fixing

The generator authors define default parameters. In some
cases, however, these parameters are not tuned for use
at the Large Hadron Collider and are superseded by parameters
obtained by comparisons to data. The criteria
for a particle to be considered stable are modified for use
in ATLAS, for example. Once high-energy data appear,
it is expected that retuning of the parameters will occur.
These tunings can be made by varying parameters at run
time. Once a new tuning is available, it can be applied
either by loading it as a Python fragment at run time or by
hard coding the values into the generator interfaces. In
either case, the tuning becomes available as part of the
next Athena software release and will be enabled by default.
The settings can be overridden if needed or the previous
defaults re-established. It is important to note that the
parameters are often not independent and a complete set
must be used. Arbitrary adjustments of a few parameters
may give inconsistent results. One of the most important
sets of tunings is concerned with the structure of minimum
bias events and spectator processes in a hard scattering
event: the underlying event. At present, these tunings are
obtained for both Pythia and Herwig [42] by first tuning
to the Tevatron data and then extrapolating. The
extrapolation from the Tevatron relies on the models used
by Pythia and Herwig. This extrapolation has been tested
by comparisons of the Tevatron data at 630 and 1800 GeV
[46,66]. A high priority task for the ATLAS simulation as
data accumulates is the testing of these tunings and
changing of the parameters as needed.



4 ATLAS Detector Description

The ATLAS detector is described in detail in Ref. [1],
but its main features will be summarized here. We discuss
the geometry used in the simulation, which as much as
possible matches the as-built detector. A cut-away view of
the entire detector is shown in Figure 2. ATLAS comprises
several concentric components. The subdetectors are:

– A Beam Conditions Monitor (BCM) and Beam Loss
Monitor used for detecting dangerous conditions and
triggering an abort in the detector system. The BCM
is located 1.84 m from the interaction point (IP) at
$|\eta| \approx 4.2$.



– A tracking detector composed of a fine granularity
pixel detector with three layers covering $|\eta| < 2.5$, a silicon
strip tracker (SCT) with eight layers determining
four space points covering $|\eta| < 2.5$, and a transition
radiation tracker (TRT) which has 32 space points on
a typical track, covering $|\eta| < 2.0$.

– Hermetic calorimetry composed of liquid argon (LAr)
electromagnetic calorimetry covering $|\eta| < 3.2$, scintillating
tile hadronic calorimetry in the barrel ($|\eta| < 1.7$),
sampling LAr hadronic calorimetry in the endcap
($1.5 < |\eta| < 3.2$), and LAr electromagnetic and
hadronic forward calorimetry covering $3.2 < |\eta| < 4.9$.

– Four different types of muon chambers, two of which
are high precision (monitored drift tubes, and cathode
strip chambers) and two of which have a rapid response
for the muon trigger (thin gap chambers, and resistive
plate chambers), covering $|\eta| < 2.7$.

– Luminosity detectors, including a zero-degree calorimeter
that sits 140 m from the interaction point, a
detector that performs a luminosity measurement using
Cherenkov integration (LUCID), and an absolute
luminosity detector for ATLAS.

Fig. 2. ATLAS detector view.

The ATLAS magnetic field is formed by a solenoid,
providing a 2.0 T uniform magnetic field in the tracking
subdetectors, and a toroidal magnet system, composed of
a barrel and two endcap toroid magnets. In the inner detector,
the field has small $\phi$- and z-asymmetries due to
the toroid field and perturbations from the iron nearby.
The field in the toroidal system has approximate z- and
eight-fold $\phi$-symmetry and provides on average 2.5 Tm of
bending power in the barrel and 5 Tm in the endcap. During
a simulation run, the field map requires about 30 MB
of memory.

Pseudorapidity, $\eta = -\ln\tan(\theta/2)$, where $\theta$ is the polar
angle measured from the beam pipe. The other coordinate
variables used are typically r, z and $\phi$, where the x-axis points
towards the center of the LHC ring, the y-axis points up, the
z-axis defines a right-handed coordinate system, $r = \sqrt{x^2 + y^2}$,
and $\phi$ is the azimuthal angle defined such that $\phi = 0$ along the
x-axis.
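The pseudorapidity definition used for the coverage figures in this section is easy to check numerically. The sketch below is a worked example only, with function names chosen for illustration. It converts between polar angle and pseudorapidity and estimates the radial distance from the beam line of a point 1.84 m along z at pseudorapidity 4.2, the values quoted for the BCM.

```python
import math


def pseudorapidity(theta):
    """eta = -ln tan(theta / 2), with theta the polar angle in radians."""
    return -math.log(math.tan(theta / 2.0))


def polar_angle(eta):
    """Inverse of `pseudorapidity`: theta = 2 atan(exp(-eta))."""
    return 2.0 * math.atan(math.exp(-eta))


def radius_at(z, eta):
    """Radial distance from the beam line at longitudinal position z."""
    return z * math.tan(polar_angle(eta))


bcm_radius_m = radius_at(1.84, 4.2)  # about 0.055 m
```

The result of a few centimetres shows how close to the beam line a detector must sit to reach such a high pseudorapidity.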



In the standard production simulation (see Section 2.2)
the luminosity detectors are not included. They can be
simulated in dedicated jobs, but keeping particles in such
a high pseudorapidity region increases simulation time by
approximately 50% per unit of pseudorapidity ($|\eta|$) per
event (see Section 5.1).



Several layouts of the complete detector are available, 
including those that were used for recording cosmic ray 
events while the detector was being completed. Test stands 
are also supported with the same infrastructure. All these 
layouts are described in Section 4.4. As much as possible,
the details of the detector geometry are preserved in the 
simulation layout. Some approximations are necessary for 
describing dead materials, for example bundles of cables 
and cooling pipes in the service areas of the detector. In 
these cases, the description only aims to match the general 
distribution of the material, including inhomogeneities in 



4.1 Simulated Detector Geometry 

The geometry structure can be viewed in terms of solids, 
basic shapes without a position in the detector; logical 
volumes, solids with additional properties (e.g. name or 
material); and physical volumes, individual placements of 
logical volumes. Table 3 shows the number of materials,
solids, logical volumes, physical volumes, and total volumes
created when constructing various pieces of the ATLAS
detector. Not all volumes are equivalent, however: in
the case of repeating structures, as in the sampling por- 
tion of the LAr calorimetry in particular, it is possible to 
define a single logical volume that is repeated in hundreds 
of physical volumes (known as volume parameterization). 
Because of nesting, one can also define dependencies that 
create many total volumes from the physical volumes used. 
In other cases, a single volume can correspond to a piece 
of shielding or support with a complex shape. One can see 
in this table the complexity of the ATLAS detector, with 
hundreds of materials and hundreds of thousands of physi- 
cal volumes. Such a detailed detector description is crucial 
for accurately modeling, for example, missing transverse 
energy, track reconstruction efficiencies, and calorimeter 
response. 
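The difference between physical volumes and total volumes can be illustrated with a small data structure. This is a simplified sketch, not the GeoModel or Geant4 API: the `LogicalVolume` class, the two counting functions, and the toy barrel are assumptions made for the example. A placement of a logical volume is counted once as a physical volume, but it appears in the total once for every copy of its mother.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalVolume:
    name: str
    material: str
    # Each entry is (daughter logical volume, number of placements in this volume).
    daughters: list = field(default_factory=list)

    def place(self, daughter, copies=1):
        self.daughters.append((daughter, copies))


def count_physical(root):
    """Placements declared below `root`, counting each logical volume once."""
    seen, total, stack = set(), 0, [root]
    while stack:
        volume = stack.pop()
        if id(volume) in seen:
            continue
        seen.add(id(volume))
        for daughter, copies in volume.daughters:
            total += copies
            stack.append(daughter)
    return total


def count_total(volume):
    """Volume instances in the fully expanded tree below `volume`."""
    return sum(copies * (1 + count_total(daughter))
               for daughter, copies in volume.daughters)


# Toy layout (hypothetical numbers): 10 modules, each holding 100 straws.
straw = LogicalVolume("straw", "gas")
module = LogicalVolume("module", "carbon_fibre")
barrel = LogicalVolume("barrel", "air")
module.place(straw, copies=100)
barrel.place(module, copies=10)
```

Here 110 placements expand to 1010 volume instances, the same effect that separates the physical and total volume counts for the TRT in Table 3.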

Table 4 shows the number of physical volumes contained
in each detector subsystem and the memory required
to build each using the GeoModel library (see Section
4.3) [67]. As expected, the two are correlated, although
differences in volume complexity invalidate a direct
correspondence. The entire geometry must be translated
into a Geant4 equivalent, so the total memory required
for the geometry of the entire ATLAS detector is
almost 300 MB (see Section 8).



Table 4. Numbers of physical volumes and memory required
to build various pieces of the ATLAS detector in GeoModel.
Here "calorimetry" is simply the sum of the liquid argon and
tile calorimetry.

Subsystem        Phys. Volumes   Memory [kB]
Inner Detector          56,838        22,268
Calorimetry            182,262        44,116
Muon System             76,945        31,524

ATLAS TOTAL            316,043        97,908



In creating such a complex, dense geometry, removing 
volume overlaps and touching surfaces provides a partic- 
ular challenge. Any overlap of more than 1 picometer and 
any place in which two volume faces touch can lead to 
stuck tracks during the simulation, a situation in which a 
track in Geant4 may not know in which volume it be- 
longs. These stuck tracks result in a loss of the event, but 
they can be overcome by introducing small gaps between 
volumes, at the cost of an extra step for each particle mov- 
ing through the transition region. 

Many layouts are available corresponding to the var- 
ious revisions of material. The material budget is con- 
stantly updated, so that the geometry description is as 
realistic as possible. During any major updates of detec- 
tor geometry, the subdetectors are generally required to 
make all changes backwards-compatible so that all older 
geometries can be configured and run as normal. This requirement
allows for fair comparisons between software
releases with consistent geometries. During any job, the
user may choose to enable or disable portions of the de- 
tector. Each subdetector is responsible for including any 
necessary materials and elements for its own construction. 
In this way, only required elements and materials are used 
during simulation, and memory loads are reduced when 
not using the entire ATLAS detector. The switches for 
disabling portions of the detector generally correspond to 
the highest level of the tree-structure in the detector ge- 
ometry (i.e. entire subdetectors, not pieces). 

It is possible to apply detector "conditions" modifica- 
tions to each chosen geometry layout. The detector mis- 
alignment can be configured by selecting misaligned lay- 
outs either for each subdetector or for the full detector at 
once. Each ATLAS subdetector sits within a well defined 
envelope, allowing each to shift and distort without col- 
liding with any other. In digitization and reconstruction 
jobs, conditions may include detector information beyond 
misalignments (e.g. dead channels). The infrastructure is 
in place to record detector conditions in a database and, 
at run time, allow the user to select conditions from a spe- 
cific data taking run. Conditions and geometry versions se- 
lected by the user can be transferred from the simulation 
jobs to the digitization and reconstruction jobs so that 
no additional user interaction is required. These default 
versions may at any time be overridden by job options. 

In order to study the penalties of a poor material de- 
scription on jet resolution and missing transverse energy 
bias, a special geometry layout with material distortions 
was created [68]. Material distortions correspond to additional
material added to half of the detector (y > 0) to
approximate a poor material description. 



4.2 Databases and Configuration 

Two databases are used to construct the detector geome- 
try chosen by the user: one to store basic constants (the 
ATLAS Geometry database), and one to store various 
conditions data (e.g. calibrations, dead channels, misalignments)
for the specific run chosen (ATLAS Conditions
database) [69]. At CERN, large (terabytes) Oracle databases
are used, primarily because they are well supported
and straightforward to update. With any stable software
release, a small subset of data needed for Athena jobs is
replicated from Oracle into SQLite [70-72] file-based databases
and is distributed to the production centers. The
large I/O requirements of production jobs can overwhelm 
a central Oracle server and are better handled by rela- 
tively small SQLite files. These files can also be replicated 
to individual production nodes for local and rapid access. 
The database replica version to be used can be chosen at 
run time for each Grid job. 
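Reading constants from a file-based replica can be sketched with Python's standard `sqlite3` module. This is an illustration under stated assumptions, not the ATLAS database schema: the table name `constants`, its columns, the tag names, and the helper functions are all hypothetical, and an in-memory database stands in for the replica file. The two stored values are quantities quoted elsewhere in this paper.

```python
import sqlite3


def make_replica(path=":memory:"):
    """Create a tiny stand-in for a replicated geometry database."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE constants (tag TEXT, name TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO constants VALUES (?, ?, ?)",
        [
            ("GEO-A", "solenoid_field_T", 2.0),
            ("GEO-A", "bcm_z_m", 1.84),
            ("GEO-B", "solenoid_field_T", 0.0),  # hypothetical solenoid-off layout
        ],
    )
    conn.commit()
    return conn


def read_constants(conn, tag):
    """Return {name: value} for every constant stored under `tag`."""
    cursor = conn.execute(
        "SELECT name, value FROM constants WHERE tag = ? ORDER BY name", (tag,)
    )
    return dict(cursor.fetchall())


replica = make_replica()
constants = read_constants(replica, "GEO-A")
```

Because each job opens its own local file, reads of this kind place no load on a central server, which is the motivation given above for the replicas.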

Both the geometry and conditions databases support 
versioning of the data. The data are organized in a tree 
consisting of branch and leaf nodes. The nodes in this 
tree can be "tagged," and one can create a hierarchy of 
the tags. Such tag hierarchies are uniquely identified by 
the tag of the root node, which is usually referred to as 
top level geometry or conditions tag. 
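The tag hierarchy can be pictured as a tree in which each tag on a node names the tags of its children. The sketch below is a minimal model of that idea and not the ATLAS database implementation: the `Node` class, the `resolve` function, and every tag name are invented for illustration.

```python
class Node:
    """A branch or leaf node whose tags may point to tags of its children."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.tag_links = {}  # tag on this node -> {child name: child tag}

    def add_child(self, child):
        self.children.append(child)
        return child

    def link(self, tag, child_tags):
        self.tag_links[tag] = dict(child_tags)


def resolve(node, tag):
    """Return {node name: tag} for the hierarchy identified by `tag` on `node`."""
    resolved = {node.name: tag}
    links = node.tag_links.get(tag, {})
    for child in node.children:
        if child.name in links:
            resolved.update(resolve(child, links[child.name]))
    return resolved


# Hypothetical tree: ATLAS -> {InnerDetector -> Pixel, Calorimeter}.
atlas = Node("ATLAS")
inner = atlas.add_child(Node("InnerDetector"))
calo = atlas.add_child(Node("Calorimeter"))
pixel = inner.add_child(Node("Pixel"))
atlas.link("ATLAS-v2", {"InnerDetector": "ID-v3", "Calorimeter": "Calo-v1"})
inner.link("ID-v3", {"Pixel": "Pixel-v5"})

selection = resolve(atlas, "ATLAS-v2")
```

A single top level tag is therefore enough to pin the version of every node below it.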
Table 3. Numbers of materials, solids, logical volumes, physical volumes, and total volumes required to construct various pieces
of the ATLAS detector. "Inner Detector" here includes the beampipe, BCM, pixel tracker, SCT, and TRT.

Subsystem          Materials    Solids   Logical Vol.   Physical Vol.   Total Vol.
Beampipe                  43       195            152             514          514
BCM                       40       131             91             453          453
Pixel                    121     7,290          8,133           8,825       16,158
SCT                      130     1,297          9,403          44,156       52,414
TRT                       68       300            357           4,034    1,756,219
LAr Calorimetry           68       674            639         106,519      506,484
Tile Calorimetry           8    51,694         35,227          75,745    1,050,977

Inner Detector           243    12,501         18,440          56,838    1,824,614
Calorimetry               73    52,366         35,864         182,262    1,557,459
Muon System               22    33,594          9,467          76,945    1,424,768
ATLAS TOTAL              327    98,459         63,769         316,043    4,806,839



A geometry database stores all fundamental constants 
for detector construction. Volume dimensions, rotations, 
and positions, as well as element and material properties 
including density, are all stored as database entries. New 
detector-specific tags may be created for inclusion in a 
global ATLAS geometry tag, where different tags gener- 
ally correspond to different detector geometry revisions. 
At run time, the user can select a global geometry tag 
as well as detector-specific geometry tags to create the 
desired geometry. In addition to constants for detector 
construction, the geometry database contains links to ex- 
ternal data files that may store, for example, magnetic 
field maps. These files are shipped with software distri- 
bution kits to production sites. By using links through 
the database, it is possible to select a magnetic field map 
based on the chosen geometry layout. Such selections, for
example the choice of field map from the name provided
in the database, can be overridden with job options.

A separate conditions database stores detector condi- 
tions data which are indexed by intervals of validity and 
tags. The entire detector may be optionally misaligned 
with a global misalignment tag, and the user may config- 
ure the job to use specific misalignment versions for each 
subdetector. The global misalignment is used frequently 
to study the performance of the entire ATLAS detector 
with misalignments of the expected as-built magnitude. 
The detector-specific misalignments allow studies of the
effects of misalignment of a single subdetector assuming
ideal alignment of the remainder of ATLAS. The inner detector,
for example, has completed an alignment challenge,
wherein simulated data was produced with misalignments, 
and the analysis group was challenged to align the detec- 
tor as with data. The tile calorimeter, on the other hand, 
does not use any misalignments in its geometry. A variety 
of misalignments have been used in the lead-up to data 
taking in order to speed the process of global detector 
alignment and improve early physics studies. 

During data collection, the alignment constants of the
detector are recorded periodically in the central conditions
database. The user is able to recreate the misalignment
conditions for a specific run by selecting an alignment version,
again by subdetector if desired, at run time.
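Indexing conditions by interval of validity and tag amounts to a sorted lookup. The sketch below is a simplified model and not the ATLAS conditions database: the `ConditionsFolder` class, the tag name, and the payload values are invented, and each entry is assumed to stay valid from its first run until the next entry under the same tag begins.

```python
import bisect


class ConditionsFolder:
    """Payloads stored per tag, each valid from `first_run` until the next entry."""

    def __init__(self):
        self._first_runs = {}  # tag -> sorted list of first valid run numbers
        self._payloads = {}    # tag -> payloads in the same order

    def store(self, tag, first_run, payload):
        runs = self._first_runs.setdefault(tag, [])
        payloads = self._payloads.setdefault(tag, [])
        index = bisect.bisect_left(runs, first_run)
        runs.insert(index, first_run)
        payloads.insert(index, payload)

    def lookup(self, tag, run):
        runs = self._first_runs[tag]
        index = bisect.bisect_right(runs, run) - 1
        if index < 0:
            raise KeyError(f"no conditions for run {run} under tag {tag!r}")
        return self._payloads[tag][index]


# Hypothetical alignment constants for two validity intervals.
folder = ConditionsFolder()
folder.store("ALIGN-ASBUILT", 2000, {"pixel_dx_mm": 0.05})
folder.store("ALIGN-ASBUILT", 1000, {"pixel_dx_mm": 0.02})
```

Selecting an alignment version for a specific run then reduces to a call such as `folder.lookup("ALIGN-ASBUILT", 1500)`.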



4.3 GeoModel and Translations 

The ATLAS simulation, digitization, and reconstruction 
each run in distinct jobs, but they must be able to use 
the same detector geometry. Therefore, a complete geom- 
etry description is maintained that can be used by each 
step and is not specific to any. By using the geometry da- 
tabases, it is already possible to read identical detector 
constants and run conditions. 

For these reasons, ATLAS uses GeoModel [67], a library
of basic geometrical shapes, to describe and construct
the detector. GeoModel contains geometry features
similar to those of Geant4: basic volumes can be con- 
structed, rotated, and shifted in space; subvolumes can 
be placed inside a volume; boolean volumes can be made 
by adding or subtracting primitives; volumes can be pa- 
rameterized and repeated. For the digitization and recon- 
struction, this detector description is entirely sufficient to 
place hits, reconstruct tracks and objects, and complete 
all necessary calculations. 

The GeoModel descriptions of most ATLAS subsys- 
tems are built using constants in the geometry database. 
However, a translator has been constructed that parses 
an XML description of a detector's geometry and builds a 
transient representation from GeoModel primitives at run 
time. This generic package can translate any valid XML 
description of detector geometry into GeoModel format. 
It has been used for describing the geometry of the muon 
system's rather complicated dead material. 
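The idea of building a transient geometry from an XML description can be sketched with the standard library XML parser. This is an illustration only: the XML element and attribute names, the `Box` and `Tube` classes, and the `build_volumes` function are hypothetical and do not reproduce the schema or primitives used by the ATLAS translator.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Box:
    name: str
    material: str
    dx: float
    dy: float
    dz: float


@dataclass
class Tube:
    name: str
    material: str
    rmin: float
    rmax: float
    dz: float


# One builder per supported XML element (hypothetical schema).
BUILDERS = {
    "box": lambda a: Box(a["name"], a["material"],
                         float(a["dx"]), float(a["dy"]), float(a["dz"])),
    "tube": lambda a: Tube(a["name"], a["material"],
                           float(a["rmin"]), float(a["rmax"]), float(a["dz"])),
}


def build_volumes(xml_text):
    """Translate every shape element into the matching primitive."""
    root = ET.fromstring(xml_text.strip())
    volumes = []
    for element in root:
        if element.tag not in BUILDERS:
            raise ValueError(f"unsupported shape: {element.tag}")
        volumes.append(BUILDERS[element.tag](element.attrib))
    return volumes


SAMPLE_XML = """
<geometry name="toy_dead_material">
  <box name="support_plate" material="Aluminium" dx="100.0" dy="5.0" dz="200.0"/>
  <tube name="cooling_pipe" material="Copper" rmin="4.0" rmax="5.0" dz="150.0"/>
</geometry>
"""

volumes = build_volumes(SAMPLE_XML)
```

Keeping one builder per element type means that supporting a new shape only requires adding an entry to the table, which suits irregular material such as supports and services.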

For the simulation, the geometry is translated entirely 
from the GeoModel to the Geant4 format. All volumes 
and subvolumes are translated, constructed, and properly 
placed within the "world volume" (the volume allocated 
for the detector, at the edge of which particles cease to 
be simulated). All information tied to GeoModel, including
position, rotation, and dimensions, is also translated
into a Geant4 equivalent. Once the geometry has been
translated, all subsystems rely solely on their Geant4
description. The GeoModel geometry is currently maintained
in memory for the entirety of the job, though it
may be released to ease memory pressure in the future.
As shown in Table 4, this release can be expected to save
100 MB of memory. Sensitive detectors and particle range
cuts (see Section 5.5), for example, are tied to the Geant4
geometry by volume name and can be added at any time
after the geometry has been constructed. Each change in 
detector description is particularly weighty in simulation, 
because any additional volumes must be built both in Ge- 
oModel and in the Geant4 geometry. 



4.4 Alternate Layouts 

In addition to the standard detector layouts, several com- 
missioning layouts are available to the user for simulation 
of cosmic ray data taking. During detector assembly, cos- 
mic ray data were taken for several runs using as many 
subdetectors as were available. Some of these subdetector 
configurations included calorimeter endcaps shifted out of 
position while the inner detector was being accessed and 
were missing large portions of the beam pipe that had 
not yet been installed. One such commissioning layout is
shown in Figure 3. For studies of cosmic rays and cavern
background, it is possible to simulate the ATLAS cavern 
surrounding the detector as well as the bedrock surround- 
ing the two shafts leading down from the surface. 

Several different magnetic field configurations are also 
available for some of the full detector layouts. Fields with 
the toroidal magnets on and solenoid off or solenoid on 
and toroidal magnets off are provided. These magnet con- 
figurations have already been used for some cosmic ray 
data taking runs and may be used for brief periods during 
high-energy collisions as well. Field maps have also been 
constructed that reflect the as-built misalignments of the 
magnet system, for example a vertical shift of 1.6 mm
in the solenoid.

There are also several test stand layouts that were
constructed to model test beam and standalone cosmic ray
runs. A sample of these test stands, including subsystems,
incident particles, and energies, is listed in Table 5.
A combined test beam run was taken with a wedge of the full
detector during 2004 [73], and standalone test beams were
constructed for the muon detectors [74,75], tile calorimeter
[76], and liquid argon calorimeter subsystems [77]. The
combined test beam setup is shown in Figure 4. Cosmic ray
data were also collected with various pieces of the inner
detector [79,80] and with the muon chambers both prior to
and after installation.

All test stand and commissioning layouts are available 
as a part of the same geometry infrastructure and can be 
selected at run time for simulation. By maintaining all de- 
tector configurations as a part of a common infrastructure, 
it is possible to ensure consistency between, for example, 
the test beam and full detector simulation. Conclusions
drawn from analysis of the test beam simulated data are
generally still valid for the full detector simulation. The
extensive tuning of the detector simulation and digitization
on test beam data can be applied directly to the full
detector. As many common elements as possible are kept
between the two, including Geant4 version and physics
list (see Section 5).



5 Core Simulation 

The standard simulation of ATLAS relies on the Geant4
particle simulation toolkit. Geant4 provides models for
physics and infrastructure for particle transportation
through a geometry, but several ATLAS-specific pieces
are provided as user code. The detector geometry itself is
constructed in the Geant4 format, and all particle scoring
(in "sensitive detector" classes) is done on the Athena
side. Each subsystem's scoring is optimized and tailored to
store only what is necessary for accurately reproducing the
performance of that particular subsystem [81-88]. Athena
code is necessary to add to the Monte Carlo truth record.
Physics models are chosen and parameters optimized for the
ATLAS detector. The results shown in this paper used Geant4
version 8.3 with official patch #2 and two modifications:
updates for boundary represented volumes and a patch to the
G4Tubs code. The software is continuously evolving, and
ATLAS has moved to newer Geant4 versions since the writing
of this paper.

The Geant4 Collaboration and ATLAS Simulation 
Group have benefitted from 15 years of close collaboration. 
Frequently, new Geant4 features have allowed faster or 
more realistic simulation of the ATLAS detector. Feature 
requests from the ATLAS collaboration have helped drive 
the development of Geant4. The ATLAS simulation has 
also provided one of the more complicated test-beds for 
the Geant4 toolkit, and Geant4 has been extensively 
evaluated and validated during large-scale simulation pro- 
duction. 

In order to provide Python flexibility to the Geant4 
simulation, an additional layer of infrastructure is neces- 
sary. "Standard" Geant4 simulation typically runs from 
compiled C++, and in order to modify any of the param- 
eters or the geometry used in the simulation it is neces- 
sary to recompile. The Framework for ATLAS Detector 
Simulation (FADS) [89] wraps several Geant4 classes in 
order to allow selection and configuration without recom- 
pilation of any libraries. Since a Python interface is used 
for configuration, all the usual introspection capabilities of 
Python may be employed. FADS wraps Geant4 base- 
classes for volumes, materials, and sensitive detectors for 
hit processing as well as Geant4 physics process defini- 
tions. These wrappers serve a dual purpose: first, they ease 
translations between the Geant4 and Athena standards 
of geometry, hits, and particle storage. Second, FADS can 
catalogue the options available to the user, loading only 
those that will be needed for the desired simulation config- 
uration while still providing all possibilities without any 
recompilation. Through FADS, a user is free to select a
physics list (see Section 5.4) for use during the simula- 
tion. The user may also modify the physics list by adding 







Fig. 3. Commissioning layout of the detector used for cosmic ray data taking during 2008. The endcap toroidal magnets and 
beampipe are not yet installed. The calorimeter endcaps (purple) are shifted by 3.1 m and the muon endcaps (green) are shifted 
to provide access to the inner detector during installation. The barrel toroid magnets are shown in yellow, and the inner detector 
is shown in blue. 

Table 5. Examples of test stands for ATLAS simulated using Geant4.

Subsystem                            Incident Particle   Energy
Hadronic Endcap calorimeter          [illegible]         6-245 GeV
Electromagnetic Barrel calorimeter   [illegible]         10-245 GeV
Electromagnetic Endcap calorimeter   [illegible]         10-200 GeV
Combined Endcap calorimeter          [illegible]         6-200 GeV
Hadronic Barrel calorimeter          [illegible]         5-350 GeV
Entire detector endcap wedge         [illegible]         1-350 GeV
Muon Detectors                       [illegible]         20-350 GeV
Silicon Pixel Tracker Endcap         Cosmic Rays         0.5-200 GeV
Silicon Strip Tracker Barrel         Cosmic Rays         0.5-200 GeV



particles or processes not included in the Geant4 tool- 
kit but included in the FADS catalogues, for instance in 
the simulation of long-lived exotic particles. Similarly, the 
detector description is configured with Python dictionar- 
ies and FADS catalogues before it is built in Geant4 and 
may be modified by the user. For example, sensitive detectors
may be assigned to any volume in the detector. Range
cuts (see Section 5.5) may also be added in the Python
and FADS layer prior to their being applied to any
constructed Geant4 geometry. Once the Python configuration
is complete, FADS objects are translated into their
Geant4 equivalents and loaded. Even after this translation,
they can be modified through the standard Geant4
user interface.
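
The catalogue idea described above (list every option, build only what is selected) can be illustrated with a small Python sketch. This is not the FADS implementation: the Catalogue class, its method names, and the dictionary stand-ins for physics lists are all assumptions made for illustration.

```python
class Catalogue:
    """Registry of named options that are only built when selected.

    Hypothetical illustration of a FADS-style catalogue; not ATLAS code.
    """

    def __init__(self):
        self._factories = {}  # name -> callable that builds the option
        self._built = {}      # name -> option object, built on demand

    def register(self, name, factory):
        self._factories[name] = factory

    def available(self):
        return sorted(self._factories)

    def select(self, name):
        # Build the option the first time it is requested.
        if name not in self._built:
            self._built[name] = self._factories[name]()
        return self._built[name]

    def loaded(self):
        return sorted(self._built)


physics_lists = Catalogue()
# The dictionaries below are stand-ins for real physics list objects.
physics_lists.register("QGSP_BERT", lambda: {"hadronic": "QGSP+BERT"})
physics_lists.register("QGSP_EMV", lambda: {"hadronic": "QGSP"})
physics_lists.register("QGSP_BERT_HP", lambda: {"hadronic": "QGSP+BERT+HP"})

chosen = physics_lists.select("QGSP_BERT")
```

All three options remain visible through available(), while only the selected one is ever constructed.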

In order to fit into the Athena framework, a service for
Geant4 and an algorithm that calls the service during the
event loop have been implemented. The service wraps
the event loop of Geant4 and provides a few additional
handles for user configuration in the Python layer. The
service also takes care of initialization and finalization of
each Geant4 event. The generated events are translated
from HepMC format into the standard Geant4 event format
prior to each event, and at the end of each event
an analysis is done to ensure that the simulation finished
without errors. Most of the functionality of the standard
Geant4 run manager is included in this service, so that
any Athena-specific modifications (e.g. event translation
from HepMC to Geant4 format) to the usual Geant4
event sequence can be made. The service also provides for
interaction with Geant4 through its standard user interface.

Fig. 4. Combined test beam setup from 2004. Labels in the
figure: end-cap muons (MDT and TGC); barrel muons (MDT, RPC
and CSC); magnet; SCT (2x4 modules); Pixel (2x3 modules);
extended barrel tile calorimeter; LAr barrel electromagnetic
calorimeter; barrel TRT (2x3 modules); magnet (1.4 Tm).

This section describes the possible user inputs, initial- 
ization, output, and various parameters of the simulation. 
Several useful features, including visualization, are also 
described. 



5.1 Simulation Input 

The ATLAS simulation offers a choice for event generation.
Events can be read from a file produced by any of
the generators described in Section 3; one of the external
generators can be configured and run concurrently; or
commands can be provided for a single particle generator.
The single particle generator can produce particles
specified by PDG identifier, position, and momentum.
Neutral and charged geantinos (pseudo-particles without any
interactions) are available for making material depth maps
of the detectors and for debugging. The user may also choose
to skip a certain number of events at the beginning of an
input file, allowing 20 simulation jobs of 50 events each to
share a 1000-event input file without overlap or repetition.
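
The skip-events bookkeeping can be sketched in a few lines of Python. The helper below is purely illustrative: its name and arguments are assumptions and do not correspond to an ATLAS job option.

```python
def skip_offsets(total_events, events_per_job):
    """Return how many input events each job should skip.

    Illustrative helper (not an ATLAS API): job i skips the events
    already assigned to jobs 0..i-1, so no event is read twice.
    """
    if events_per_job <= 0:
        raise ValueError("events_per_job must be positive")
    n_jobs = total_events // events_per_job
    return [job * events_per_job for job in range(n_jobs)]


# The example from the text: a 1000-event file split into 50-event jobs.
offsets = skip_offsets(total_events=1000, events_per_job=50)
```

For the example in the text this gives 20 jobs, the first skipping no events and the last skipping 950.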



Several cuts and transformations can then be made
to the event. The vertex position is smeared to represent
the luminous region within ATLAS.$^6$ It can be shifted if
the user desires, but both the shift and smear are given
initial default values that represent ideal collisions within
the ATLAS detector. The generated event can be rotated
in any direction, though only rotation in $\phi$ is physical.
Primary particles are only passed through the detector
simulation if they are within a specified range in
$\eta$-$\phi$. By default, primary particles with
$|\eta| > 6.0$ are not simulated, to save time. This cut was
chosen to ensure consistent response in the forward
calorimeter: a cut at $|\eta| = 6.0$ allows a sufficient
number of particles to scatter back into the forward
detector from high $\eta$ without requiring an unacceptable
amount of CPU time. Generally, an increase of one unit of
pseudorapidity corresponds to an increase of 40-120% in CPU
time, so that it is not possible to simulate the very
forward detectors like LUCID during a standard simulation
job. Figure 5 shows the $\eta$-dependence of the CPU time
per event. The increase is approximately three-fold in
$t\bar{t}$ events and eight-fold in minimum bias events
from simulating particles in $|\eta| < 3.0$ to simulating
particles in $|\eta| < 8.0$. The difference between the two
types of events is primarily because the majority of
activity in the minimum bias events is forward, and there
is considerable central ($|\eta| < 3.0$) activity in the
$t\bar{t}$ events.
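
As a minimal sketch of the pseudorapidity cut described above, the following Python computes $\eta = \mathrm{asinh}(p_z / p_T)$ from a momentum vector and keeps only primaries with $|\eta| \le 6.0$. The function names and the sample momenta are assumptions for illustration; the actual cut is applied inside the ATLAS simulation framework.

```python
import math


def pseudorapidity(px, py, pz):
    """Pseudorapidity eta = asinh(pz / pT) of a momentum vector."""
    pt = math.hypot(px, py)
    if pt == 0.0:
        if pz == 0.0:
            raise ValueError("zero momentum has no direction")
        # Exactly along the beam axis: eta is +/- infinity.
        return math.copysign(math.inf, pz)
    return math.asinh(pz / pt)


def passes_eta_cut(momentum, eta_max=6.0):
    """True if the primary would be handed to the detector simulation."""
    return abs(pseudorapidity(*momentum)) <= eta_max


# Hypothetical momenta (arbitrary units): central, forward, very forward.
primaries = [(1.0, 0.0, 0.0), (0.1, 0.0, 10.0), (0.001, 0.0, 50.0)]
kept = [p for p in primaries if passes_eta_cut(p)]
```

The third particle has $|\eta|$ of about 11.5 and is dropped, while the first two fall inside the default range.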



6 During early data taking, the beams will collide head-on.
Therefore, no crossing angle is added to the simulated events
for the time being.

in $|\eta| < 8.0$. The average of 200 simulated $t\bar{t}$ and minimum bias events was taken. Linear fits are overlaid.



At run time, either through job options or in the
production system described in Section 2.2, seeds for
pseudo-random number generators to be used by Geant4,
Athena, and any particle generators can be set. Different
pseudo-random number generators may be configured for
each. Since all random number seeds can be controlled, a
single job is entirely reproducible. The seeds can also be
written to a file or read from a file, providing an additional
level of reproducibility.$^7$
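
The role of seed control in reproducibility can be illustrated with Python's standard random module. This is only a sketch: the stream names, seed values, and file format are assumptions and do not reflect how Athena or Geant4 store their seeds.

```python
import json
import os
import random
import tempfile


def make_streams(seeds):
    """Create one independent generator per named stream."""
    return {name: random.Random(seed) for name, seed in seeds.items()}


# Hypothetical stream names and seed values, for illustration only.
seeds = {"geant4": 12345, "athena": 67890, "generator": 13579}

# Write the seeds to a file as a record of the job configuration.
path = os.path.join(tempfile.mkdtemp(), "seeds.json")
with open(path, "w") as handle:
    json.dump(seeds, handle)

# A second "job" reads the seeds back from the file.
with open(path) as handle:
    restored = json.load(handle)

first_job = make_streams(seeds)
second_job = make_streams(restored)
first = [first_job["geant4"].random() for _ in range(3)]
second = [second_job["geant4"].random() for _ in range(3)]
```

Because both jobs start every stream from the same seed, they draw identical sequences.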

The user must also select a layout for the detector. As 
described in Section 4, several layouts of the full detector
and various test stands are available. Combined test beam 
simulation requires such a different configuration that an 
independent but similar core drives the Python configu- 
ration and loading of user job options. The distinction is 
made at run time based on the detector or test stand con- 
figuration selected. The layout of the detector determines 
what other options are available to the user at run time. 
During simulation of the entire ATLAS detector, several 
additional options are available. For example, a neutron 
time cut (see Section 5.5) may be enabled. The magnetic
field may be enabled and the field map may be selected. 

The user may optionally select a set of run conditions 
for the simulation job, through which all options and flags 
are set and a pre-defined job is run. This option is partic- 
ularly useful for testing and debugging. 



5.2 Simulation Initialization 

Although the initialization in a standard Athena job
occurs in a single step, for an ATLAS simulation job the
initialization is broken into three steps to allow additional
user intervention. Table 6 summarizes the processes that
occur at each one of the three steps. The division of the
initialization is such that most modifications to the simu- 
lation conditions can be accomplished in job options alone 
(i.e. without code modification). Normally, the user pro- 
vides job options and allows the initialization to progress 
unhindered. Some parts of the job, for example the detec- 
tor layout, are only loaded after the initialization has be- 
gun. In order to modify volumes after the layout is loaded, 
the user must intervene during the initialization. Only cer- 
tain commands are effective at each stage of the initial- 
ization, since some parts of the Geant4 simulation have 
been loaded and created while others have yet to be trans- 
lated from dictionaries. 

Stage one of the initialization occurs as soon as Athena
is started. Several external Python modules are loaded
that provide basic functionality for any Athena job. The
job properties provided by the user are read during this
phase and are locked. Once the job properties are locked,
any significant modification to the running of a simulation
job must be done by directly accessing the affected services
and algorithms. This saves propagation of changes in the
case of a late modification to a job property. Metadata
that will be stored with the hit output file are gathered.
External dependencies that require early initialization are
loaded, providing a service for GeoModel, a service for
database interaction, and a service for frozen showers (see
Section 7.1). The event generation mode (reading external
events, generating events from an external generator, or
generating single particles on-the-fly) is determined, and
any necessary configuration is included for the generator.
A stream is opened for the output hit file, if necessary, and
hit containers for each enabled subdetector are added to
the new file. Finally, a service is created to interface with
and control Geant4, although at this point Geant4 is
not fully initialized.

7 Because of caching in Geant4, it is not possible to
reproduce an individual event or track without starting from
the beginning of the job.

Table 6. Initialization sequence for the ATLAS simulation. Dividing the necessary configuration into several distinct steps
allows user intervention at critical points.

Init. Stage   Processes

1             External modules loaded, job properties locked, metadata written, event generation
              configured, hit file initialized, Geant4 service created

2             Detector, physics regions, range cuts created, GeoModel geometry translated, truth
              strategies initialized, magnetic field loaded, physics list selected, user actions initialized

3             Fast simulation models assigned, physics regions constructed, sensitive detectors
              assigned, Geant4 run manager and physics models initialized, recording envelopes and
              visualization initialized

Stage two of the initialization begins with the construction
of the detector in Python dictionaries. Dictionaries
of physics regions, range cuts, and volumes in which to
apply step limitation (see Section 5.5) are constructed, and
all key properties are assigned to the detector facilities
or built in dictionaries for later addition to the geometry.
Each enabled piece of the ATLAS detector or test setup
is then recursively constructed in GeoModel according to
the parameters specified in the geometry and conditions
databases. After each subdetector has been constructed in
GeoModel, it is translated recursively into an equivalent
Geant4 geometry. After this point in the initialization,
all volumes and regions are available to the user for
modification. Next, the Monte Carlo truth strategies (see
Section 5.3) are added to the simulation. The magnetic field
is then loaded. Under normal circumstances, the field is a
map loaded from an external data file, the name of which
is specified in the geometry database according to the
geometry layout selected. The user may optionally choose
to load data from one of the magnetic field test
configurations, rather than the standard ATLAS magnetic field,
or to create a new, basic magnetic field. The physics list
(see Section 5.4) to be used for simulation is also set at
this point.

User actions are then initialized. Geant4 allows a user
to insert pieces of code in various places throughout the
simulation event loop, including after each step, when each
track is queued ("stacked"), before and after each event,
before and after each track is simulated, and before and
after each run.$^8$ By default, ATLAS includes user actions
that monitor simulation time, memory, and the number
of tracks generated during each event. A neutrino cut (see
Section 5.5) is also implemented as a track-stacking action.
Whenever a new track is queued, its type is checked. The
LAr calorimeters also use end-of-event actions for merging
hits to save space prior to storage. All truth storage
strategies (see Section 5.3) are implemented as stepping actions
that store interesting interactions based on the type of
process, detector region, and energies of the particles
involved. Users may also configure their own actions and
add them to the simulation in the same way. Examples
have been constructed for integrating interaction lengths
or radiation lengths through the detector when making
geantino maps, for stopping or killing particles if certain
conditions are met, and for turning on additional output
only under specific conditions in order to study a bug or
issue without having to sift through enormous log files.
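
The track-stacking hook can be pictured with the following Python sketch, in which a neutrino cut is registered as one action among possibly several. The StackingActions class and the dictionary-based track records are hypothetical stand-ins; in Geant4 itself such hooks are C++ user action classes.

```python
NEUTRINO_PDG_IDS = {12, 14, 16}  # nu_e, nu_mu, nu_tau (PDG numbering)


class StackingActions:
    """Runs registered callbacks whenever a new track is queued.

    Hypothetical illustration, not the Geant4 or ATLAS interface.
    """

    def __init__(self):
        self._actions = []

    def add(self, action):
        self._actions.append(action)

    def classify(self, track):
        # A track is kept only if every registered action accepts it.
        return all(action(track) for action in self._actions)


def neutrino_cut(track):
    """Reject neutrinos and antineutrinos as soon as they are stacked."""
    return abs(track["pdg_id"]) not in NEUTRINO_PDG_IDS


actions = StackingActions()
actions.add(neutrino_cut)

# Electron, muon antineutrino, neutron.
new_tracks = [{"pdg_id": 11}, {"pdg_id": -14}, {"pdg_id": 2112}]
stacked = [t for t in new_tracks if actions.classify(t)]
```

A user-defined action would be added with the same add() call, mirroring the statement that users add their own actions "in the same way".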

Stage three of the initialization completes the job prep- 
aration. During this stage, the fast simulation models are 
built and added to the volumes to which they have been 
assigned. Any physics regions that will be used are con- 
structed. Sensitive detectors are built and assigned to the 
regions of the detector that are to be made sensitive (i.e. 
in which hits will be stored). Geant4's run manager and 
physics models are initialized. Recording envelopes are 
added (see Section 5.3), and any visualization that has 
been enabled by the user is initialized (see Section 5.7). 



Once the initialization is complete and all the neces- 
sary elements have been loaded into memory, the event 
loop begins. 
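
The three-stage sequence, with user intervention possible between stages, can be sketched as follows. The StagedInit class, stage labels, and hook mechanism are assumptions chosen for illustration and are not the Athena interface.

```python
class StagedInit:
    """Runs initialization stages in order, with user hooks after each.

    Hypothetical illustration of a staged initialization.
    """

    STAGES = ("stage1", "stage2", "stage3")

    def __init__(self):
        self.hooks = {stage: [] for stage in self.STAGES}
        self.log = []

    def add_hook(self, stage, func):
        if stage not in self.hooks:
            raise KeyError("unknown stage: " + stage)
        self.hooks[stage].append(func)

    def run(self):
        for stage in self.STAGES:
            self.log.append(stage)
            # User code runs once the stage's own work is complete.
            for func in self.hooks[stage]:
                func(self)
        self.log.append("event loop")


def tweak_volume(job):
    # The geometry exists only after stage two, so the hook goes there.
    job.log.append("user: modify volume")


job = StagedInit()
job.add_hook("stage2", tweak_volume)
job.run()
```

Attaching the hook to stage two reflects the point made above: volumes can only be modified once the layout has been loaded.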



5.3 Monte Carlo Truth Information 



The Geant4 simulation adds to the Monte Carlo truth
record already defined during generation (see Section 3.6).
Far too many secondary tracks are produced during detector
simulation to store information for every interaction.
Only those interactions which are of greatest relevance 
to physics analyses are saved, according to several saving 
rules ("strategies"). Most are applicable only to the in- 
ner detector. For each interaction that satisfies any of the 
storage criteria, the incoming particle, step information, 
vertex, and outgoing particles are included in the truth 
record. Later in the software chain, individual track seg- 
ments are recombined so that, for example, a single elec- 
tron that undergoes several bremsstrahlung events along 
its path is counted as only one "true" particle. 

The strategies include (with all cuts on kinetic
energy$^9$):

- In the inner detector, bremsstrahlung vertices are stored
if the primary electron or muon has an energy above
500 MeV and the photon produced has an energy above
100 MeV.

- In the inner detector, ionization vertices are stored if
the primary particle has an energy above 500 MeV and
the electron generated has an energy above 100 MeV.

- In the inner detector, hadronic interaction vertices are
stored if the primary particle has an energy above
500 MeV.

- In the inner detector, decay vertices are stored if the
decaying particle has an energy above 500 MeV.

- In the inner detector, the conversions of photons above
500 MeV are stored.

- In the calorimeter, muon bremsstrahlung vertices are
stored if the primary muon's energy was above 1 GeV
and the photon generated is above 500 MeV.

8 Here, run is used in the Geant4 sense to refer to a finite set
of events within a simulation job. Several runs may comprise a
job, and each run may include an arbitrary number of events.

9 For the most recent production, cuts are applied on
transverse momentum, $p_T > 100$ MeV, rather than on kinetic
energy. The lower cut allows for a study of tracking performance
in minimum bias events where it may be possible to reconstruct
tracks down to only a few hundred MeV.

All cuts and regions of applicability are configurable,
so that any energy cut-offs can be modified and
a strategy can be assigned to any volume in the simulation.
Additional rules could be constructed, for example,
for tracking shower development within the calorimeter,
but many would consume too much CPU time and disk
space for use in standard simulation jobs.
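
A configurable truth strategy of the kind listed above might look like the following Python sketch, which encodes the two bremsstrahlung rules with their thresholds. The BremStrategy class and region labels are hypothetical; the real strategies are implemented as stepping actions in the simulation.

```python
from dataclasses import dataclass


@dataclass
class BremStrategy:
    """Decides whether a bremsstrahlung vertex enters the truth record.

    Hypothetical illustration; thresholds are kinetic energies in MeV.
    """

    region: str = "inner_detector"
    min_primary_mev: float = 500.0
    min_photon_mev: float = 100.0

    def accepts(self, region, primary_mev, photon_mev):
        return (region == self.region
                and primary_mev > self.min_primary_mev
                and photon_mev > self.min_photon_mev)


# Inner detector rule: 500 MeV primary, 100 MeV photon.
inner = BremStrategy()
# Calorimeter rule for muons: 1 GeV primary, 500 MeV photon.
calo = BremStrategy("calorimeter", 1000.0, 500.0)
```

Changing a cut-off or assigning the rule to another region only requires constructing the strategy with different arguments.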

Standard simulation jobs also define several volumes 
that are used to record all particles escaping part of the 
detector. All tracks above 1 GeV are typically recorded 
at the end of the inner detector, the end of the calori- 
meter, and the end of the muon system (and the end of 
the ATLAS world volume). It is possible for the user to 
configure the simulation at run time to add further
volumes to the list of recording volumes. In each case,
tracks are saved as they exit each volume. 



5.4 Physics List 

Physics lists include all numerical models that describe the 
particles' interactions in the Geant4 simulation. Models 
are generally good for a single type of interaction and 
over a limited energy range. The Geant4 collaboration 
provides several combinations of these models that have 
been tailored to various scenarios as standard physics lists 
that ship with each distribution. In order to enhance re- 
producibility and ensure that validated combinations of 
models are used, only those physics lists provided by the 
Geant4 collaboration are used by the ATLAS simula- 
tion. One exception is allowed, namely transition radia- 
tion. Transition radiation is crucial for the tracking por- 
tion of the inner detector and is added to each physics 
list. 

There are several physics lists that are used by ATLAS: 

QGSP_BERT - the physics list used for all simulation
production after 2008. The list includes the Quark-Gluon
String Precompound model (QGSP) and the Bertini 
intranuclear cascade model (BERT) [4] as part of the 
hadronic physics package. The electromagnetic physics 
package includes step-limiting Multiple Coulomb Scat- 
tering (MSC). 



QGSP_EMV - the physics list used for simulation produc- 
tion before 2008. This list included the QGSP model, 
but without the Bertini cascade. The MSC of this list 
was not allowed to limit the step, so it is labeled an 
electromagnetic variant (EMV). 

QGSP_BERT_HP - the physics list used for neutron flu- 
ence studies and comparisons with the Fluka simu- 
lation package [92]. This list includes the QGSP and 
Bertini models, step-limiting MSC, and additional "high- 
precision" low-energy neutron physics models. 

A step limitation process that controls the maximum
allowed step length of a charged particle was added in
the inner detector when using the QGSP_EMV physics
list. It helped the simulation to better reproduce test
beam and cosmic ray data. The step-limiting MSC that
is a part of QGSP_BERT was found to agree equally well
with data, and therefore the step limitation was removed
from simulation with QGSP_BERT.

These physics lists were studied in detail for each
subdetector [93]. Table 7 shows the number of steps, number
of hits in sensitive detector regions, and number of
secondary particles with kinetic energy above 50 MeV and
1 GeV within several regions of the detector and for the
whole of ATLAS using both the QGSP_EMV and QGSP_BERT
physics lists. Sensitive detectors to record calibration
hits (described in Section 5.6) are included in the
cryostat of the LAr calorimeter. The average was taken
of 50 $t\bar{t}$ events, where there were on average 482
primary (generator-level) particles per event. The
calorimeter clearly dominates the total number of steps and
hits in sensitive detectors for both physics lists. The muon
system, though it has a number of steps comparable to the
inner detector, consists mostly of shielding and therefore
has far fewer hits in sensitive detector regions. The
numbers of steps divided into different process types for
QGSP_EMV and QGSP_BERT are listed in Tables 8 and 9. In
both cases, transportation processes dominate the inner
detector simulation, while electromagnetic physics and
transportation dominate the calorimeter and the muon system.

Simulation time was also examined for each physics
list. Simulation using the QGSP_BERT physics list
consumes approximately 2.5 times more CPU time than does
simulation with the QGSP_EMV physics list. However, applying
a neutron time cut (see Section 5.5) with the QGSP_BERT
list reduces simulation time by more than 30%. Simulation
with QGSP_BERT_HP requires approximately five
times more CPU time than QGSP_EMV. Therefore, the
QGSP_BERT_HP physics list cannot be used for standard
simulation.



5.5 Simulation Optimizations 

In order to optimize use of both disk space and CPU time,
several other modifications are made to the standard Geant4
simulation [93,94].

Table 7. Number of steps, number of hits in sensitive detector (SD) regions, and number of secondary particles with kinetic
energy above 50 MeV and 1 GeV within several regions of the detector and for the whole of ATLAS, using both the QGSP_EMV
and QGSP_BERT physics lists. The average was taken of 50 $t\bar{t}$ events, where the average number of primary tracks per event
was 482. Sensitive detectors to record calibration hits are included in the cryostat of the LAr calorimeter.

QGSP_EMV         Steps         Hits in SD    Sec. above 50 MeV   Sec. above 1 GeV
Inner Detector   1.80 x 10^6   3.10 x 10^5   1,570               260
Calorimetry      1.87 x 10^7   6.87 x 10^6   39,900              2,040
Muon System      1.90 x 10^6   1,030         7,820               332
Total ATLAS      2.24 x 10^7   7.18 x 10^6   49,300              2,630

QGSP_BERT        Steps         Hits in SD    Sec. above 50 MeV   Sec. above 1 GeV
Inner Detector   2.13 x 10^6   1.98 x 10^5   1,450               269
Calorimetry      3.93 x 10^7   1.36 x 10^7   40,100              2,170
Muon System      2.69 x 10^6   1,285         8,210               385
Total ATLAS      4.41 x 10^7   1.38 x 10^7   49,700              2,820

Table 8. Number of steps for various processes and detector regions during simulation with the QGSP_EMV physics list. The
average was taken of 50 $t\bar{t}$ events, where the average number of primary tracks per event was 482. The "other processes" in the
inner detector are primarily step limitation processes.

Process                      Inner Detector   Calorimeter   Muon System
Transportation               1.50 x 10^6      9.33 x 10^6   2.15 x 10^5
MSC                          4,910            1.09 x 10^5   5,200
Photoelectric Effect         6,060            1.32 x 10^6   2.03 x 10^5
Compton Scattering           12,800           1.43 x 10^6   4.26 x 10^5
Ionization                   1.08 x 10^5      4.97 x 10^6   8.10 x 10^5
Bremsstrahlung               6,310            1.28 x 10^6   1.87 x 10^5
Conversion                   434              82,400        17,300
Annihilation                 291              82,800        17,800
Decay                        254              2,320         538
Other Hadronic Interaction   1,710            1.23 x 10^5   21,600
Other Process                1.56 x 10^5      4,800         831
Total                        1.80 x 10^6      1.87 x 10^7   1.90 x 10^6

Comparing the QGSP_BERT physics list to the QGSP_EMV
physics list, approximately three times as many
neutrons are generated in typical hard scattering events,
and they travel approximately three times further. These
neutrons cause an increase in the output hit file size of
approximately 75% as well as an increase in CPU time
per event for hard scattering events. A Geant4 neutron
time cut is, therefore, applied which removes all neutrons
150 ns after the primary interaction. This was found to be
sufficient time for the hadronic shower development and
did not degrade the energy scale or energy resolution of
the calorimeters. Output files are the same size when
using the QGSP_BERT physics list with this cut enabled as
they are when using the QGSP_EMV physics list without
a neutron time cut. The simulation time required for
QGSP_BERT is reduced by 10-15% when the neutron cut
is enabled.
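
The logic of the neutron time cut is simple enough to sketch directly. The function below is an illustration only: its name and the tuple-based track records are assumptions, and in the real simulation the cut is applied inside Geant4 rather than to a Python list.

```python
NEUTRON_PDG_ID = 2112
NEUTRON_TIME_CUT_NS = 150.0  # value quoted in the text


def keep_track(pdg_id, global_time_ns, time_cut_ns=NEUTRON_TIME_CUT_NS):
    """Return False for neutrons older than the time cut."""
    if pdg_id == NEUTRON_PDG_ID and global_time_ns > time_cut_ns:
        return False
    return True


# (PDG id, time since the primary interaction in ns); made-up values.
tracks = [(2112, 20.0), (2112, 400.0), (22, 400.0)]
survivors = [t for t in tracks if keep_track(*t)]
```

Only the late neutron is removed; other particle types are unaffected regardless of their time.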

Neutrinos are also removed as soon as they are created
in the simulation. No particle is allowed by Geant4 to
step through more than one volume at a time. Therefore,
neutrinos may require several thousand steps to exit the
entire ATLAS detector. They may therefore consume a
noticeable fraction of simulation time, even though their
interaction probability is practically null. The removal is
done when the particles are stacked.

Range cuts are Geant4 parameters that control the
creation of secondary electrons or photons during
bremsstrahlung and ionization processes. If the expected range
of the secondary is less than some minimum value, the
energy of that secondary particle is deposited at the end
of the primary particle's step and no separate secondary
is produced. Effectively, this parameter defines an energy
scale at which particle propagation may be ignored. By
increasing the range cuts throughout the detector one can
decrease the CPU time required per event. Particularly
near boundaries and thin materials, the detector's
sampling fraction may be affected if the range cuts are too
large. Range cuts can be specified separately for electrons,
positrons, and photons, but in ATLAS the same distance
is used for all three. Range cuts are specified as a distance,
and for each material the distance is translated into an
energy based on the average energy loss of a particle in that
material. For the majority of the ATLAS detector, range
cuts take a default value of 1 mm. Exceptions are listed
in Table 10. Deviations usually occur in sensitive volumes
that are very thin, where it is important to correctly
calculate the sampling fraction of the detector or model the
energy deposition. Reduced range cuts are also applied to
very thin volumes that are adjacent to sensitive volumes
for the same reason. In the monitored drift tube muon
chambers, for example, range cuts are only reduced in the
thin aluminum tubes surrounding the sensitive detector
(gas) - the gas itself takes the standard 1 mm cuts. In
some shielding volumes it may be possible to relax range
cuts considerably without degrading physics performance.

Table 9. Number of steps for various processes and detector regions during simulation with the QGSP_BERT physics list. The
average was taken of 50 $t\bar{t}$ events, where the average number of primary tracks per event was 482. The "other processes" in the
calorimeter and muon system are primarily neutron killer processes.

Process                      Inner Detector   Calorimeter   Muon System
Transportation               1.76 x 10^6      1.46 x 10^7   2.31 x 10^5
MSC                          2.31 x 10^5      1.48 x 10^7   5,200
Photoelectric Effect         6,760            1.37 x 10^6   2.32 x 10^5
Compton Scattering           14,800           1.66 x 10^6   5.03 x 10^5
Ionization                   1.03 x 10^5      4.81 x 10^6   9.71 x 10^5
Bremsstrahlung               6,060            1.22 x 10^6   1.92 x 10^5
Conversion                   416              86,800        18,100
Annihilation                 271              87,000        18,500
Decay                        212              1,670         402
Other Hadronic Interaction   2,190            6.66 x 10^5   1.23 x 10^5
Other Process                426              25,400        5,720
Total                        2.13 x 10^6      3.93 x 10^7   2.69 x 10^6

Geant4 uses a set of parameters to control errors
and accumulated biases in the transportation of charged
particles through a magnetic field. Because the equation of
motion is solved numerically, the user must select the 
numerical integration method to be used, including the 
order of integration, and the tolerances on the errors of 
the step. ATLAS has chosen to use the Geant4 stan- 
dard fourth-order Runge-Kutta method with the default 
error parameters for the majority of the detector. These 
parameters are generally satisfactory and result in errors 
and biases that are less than the position resolution of the 
detector. In the inner detector, however, tracks were found 
to be shifted sufficiently that detector residuals were af- 
fected. Here, stepping parameters were tightened by an 
order of magnitude. Further optimization of the stepping 
algorithms of Geant4 has been undertaken, including the 
configuration of the choice of stepper and stepping pa- 
rameters as a function of the initial particle type, energy, 
and position within the detector. Such a configuration can 
allow more careful stepping of muons in the calorimetry
without degrading the total performance of the simulation 
significantly. Muons in particular can accumulate a signif- 
icant bias after passing through all the sampling layers of 
the calorimeter, making more accurate tracking necessary. 
As a fourth-order stepper requires four values of the
magnetic field to be calculated, optimization of magnetic field
map access will also be key to improving the performance
of the simulation's tracking.
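To make the cost of field access concrete, the following Python sketch performs one classical fourth-order Runge-Kutta step for a charged particle and counts the field evaluations. It is an illustration only and not the Geant4 stepper: the function names, the constant k (standing in for charge over momentum in arbitrary units), and the uniform test field are all assumptions of this example.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rk4_step(pos, direction, h, field, k):
    """Advance position and unit direction by path length h.

    Integrates dx/ds = u and du/ds = k * (u x B(x)).
    Returns (new_pos, new_direction, number_of_field_evaluations).
    """
    calls = 0

    def deriv(x, u):
        nonlocal calls
        calls += 1  # one field lookup per derivative evaluation
        b = field(x)
        return u, tuple(k * c for c in cross(u, b))

    def shift(v, d, f):
        return tuple(vi + f * di for vi, di in zip(v, d))

    k1x, k1u = deriv(pos, direction)
    k2x, k2u = deriv(shift(pos, k1x, h / 2), shift(direction, k1u, h / 2))
    k3x, k3u = deriv(shift(pos, k2x, h / 2), shift(direction, k2u, h / 2))
    k4x, k4u = deriv(shift(pos, k3x, h), shift(direction, k3u, h))

    def combine(v, a, b, c, d):
        return tuple(vi + h / 6 * (ai + 2 * bi + 2 * ci + di)
                     for vi, ai, bi, ci, di in zip(v, a, b, c, d))

    return (combine(pos, k1x, k2x, k3x, k4x),
            combine(direction, k1u, k2u, k3u, k4u),
            calls)


# Uniform field along z (illustrative), particle initially moving along x.
new_pos, new_dir, n_calls = rk4_step((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.1,
                                     lambda x: (0.0, 0.0, 1.0), k=1.0)
norm = math.sqrt(sum(c * c for c in new_dir))
```

Each step calls the field function four times, so any saving in the cost of a single field-map lookup is multiplied by four per step in a scheme of this kind.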



5.6 Hit Storage Format 

The output from the simulation is a hit file, containing 
some metadata describing the configuration of the sim- 
ulation during the run, all requested truth information, 
and a collection of hits for each subdetector. The hits are 
records of energy deposition, with position and time, during
the simulation. Each subdetector is responsible for
implementing its own sensitive detector for the selection,
processing, and recording of these hits. In most subsys- 
tems, including the inner detector and muon system, this 
consists simply of recording all hits that occur in sensitive 
regions of the detector for subsequent storage. Some ad- 
ditional manipulation is done at the end of each event to 
compress the output as much as possible; still, the files are 
typically 2 MB per event for hard scattering events (e.g. 
$t\bar{t}$ production).

The file size is large, mostly due to the inner detector, 
for which the majority of hits are independently stored. 
Merging hits there is difficult, since they tend to be
isolated and cannot normally be merged across readout
channels. Inner detector hits typically consume 60% of the disk space in
a hit file (e.g. 65% of the hit file for $t\bar{t}$ events). In the
calorimetry, there are far too many hits created by elec- 
tromagnetic and hadronic showers for the individual stor- 
age of a four-vector for each (see Section 5.4). Instead, hit 
merging occurs at the end of each event. By optimizing 
time binning, hits can be compressed to a large extent. 
About 10% of the hit file is consumed by optional "cali- 
bration hits" for the calorimeters, hits in dead material, 
stored to improve the detector calibration and missing en- 
ergy calculation and to study simulation-based calorime- 
ter calibration schemes. Under normal circumstances, the 
muon systems contribute a negligible portion of the hit
file. The contributions by subdetector can be found in
Table 11 for the average of 50 simulated $t\bar{t}$ events. Here and
elsewhere in this paper, file sizes are without compression
and are taken from ROOT. In practice, compression
reduces the actual disk space required for the files, but
file-level metadata adds several hundred kilobytes.

Table 10. Range cuts for detectors that do not take the ATLAS default of 1 mm.

Subdetector                                              Range Cut Value
Silicon pixels and strips in the inner detector          0.05 mm
Gas in the transition radiation tracker                  0.05 mm
Electromagnetic Barrel and Endcap calorimeters           0.1 mm
Forward calorimeter (all compartments)                   0.03 mm
Aluminum tubing of monitored drift tube muon chambers    0.05 mm

By comparing Tables 7 and 11, one can understand
these numbers in terms of hits in the sensitive detector 
region. Although the muon system is large, the major- 
ity of it is shielding. Therefore, it collects far fewer hits 
than the other subsystems and requires less disk space 
for the hit records. The calorimetry produces 95% of the 
hits in sensitive regions during simulation. Because of the 
compression applied prior to storage, the calorimetry com- 
prises only 25% of the hit file. 
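The end-of-event merging of calorimeter hits can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the ATLAS sensitive-detector code: the hit layout (cell identifier, time, energy), the function name, and the 2.5 ns bin width are hypothetical choices made for the example.

```python
from collections import defaultdict


def merge_hits(hits, time_bin_ns=2.5):
    """Merge hits that share a cell and a time bin by summing their energy.

    hits: iterable of (cell_id, time_ns, energy) tuples.
    Returns a sorted list of (cell_id, time_bin_index, total_energy).
    """
    merged = defaultdict(float)
    for cell_id, time_ns, energy in hits:
        bin_index = int(time_ns // time_bin_ns)
        merged[(cell_id, bin_index)] += energy
    return sorted((cell, b, e) for (cell, b), e in merged.items())


example_hits = [(7, 0.4, 1.0), (7, 1.9, 2.0), (7, 3.1, 0.5), (8, 0.1, 4.0)]
merged_hits = merge_hits(example_hits)
```

In this sketch a wider time bin merges more hits and therefore stores fewer records, at the price of coarser timing information; this is the trade-off behind optimizing the time binning.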



5.7 Visualization 



Visualization is used to understand anomalies or features 
in odd events, occasionally to debug errors due to geom- 
etry, and to check for overlaps and touching volumes in 
the geometry that can be spotted by eye. Although
Geant4 contains viewing software of its own, because the
geometry of ATLAS must be translated from GeoModel
into Geant4 format it is useful to use a viewer that can
construct geometry directly from its GeoModel
description. A general purpose three-dimensional event display
program, VP1 (see footnote 10) [96], has been developed specifically for
ATLAS. It is optimized for the visualization
of the ATLAS geometry and is arguably the most useful
tool for understanding and debugging of detector
description across all ATLAS subsystems. Two examples of VP1
event displays are in Figures 6 and 7. Ray Tracer, the
Geant4 visualization utility, has also been used to visualize
portions of the detector containing some exotic shapes.

VP1, as well as other event display programs used in
ATLAS (e.g. Atlantis [97] and Persint [98]), is mostly
used for visualizing real and simulated events after they
have run through the common reconstruction software.
The VP1 viewer can also be injected directly into the
simulation job in order to visualize events immediately after the
simulation step.



6 Digitization 

The ATLAS digitization software converts the hits
produced by the core simulation into detector responses:
"digits." Typically, a digit is produced when the voltage or
current on a particular readout channel rises above a
preconfigured threshold within a particular time window. Some
subdetectors' digit formats include the signal shape in
detail over this time, while others simply record that the
threshold has been exceeded within the relevant time
window.
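The simplest form of digit, a record that a threshold was exceeded inside a time window, can be sketched in a few lines of Python. The function name, the sampled-pulse representation, and the numbers are hypothetical and serve only to illustrate the idea; real subdetector digitization is considerably more detailed.

```python
def make_digit(samples, threshold, window):
    """Return the time of the first threshold crossing inside the window.

    samples: list of (time_ns, amplitude) pairs for one readout channel.
    window: (t_min_ns, t_max_ns), inclusive.
    Returns None if the signal never exceeds the threshold in the window.
    """
    t_min, t_max = window
    for time_ns, amplitude in sorted(samples):
        if t_min <= time_ns <= t_max and amplitude > threshold:
            return time_ns
    return None


pulse = [(0, 0.1), (10, 0.6), (20, 1.4), (30, 0.9)]
digit_time = make_digit(pulse, threshold=1.0, window=(0, 25))
no_digit = make_digit(pulse, threshold=1.0, window=(0, 15))
```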

The peculiarities of each subdetector's charge
collection, including cross-talk, electronic noise and
channel-dependent variations in detector response, are modelled in
subdetector-specific digitization software [79, 82, 99-101].



10 ATLAS is at Point 1 of the LHC ring. The name VP1 is 
short for Virtual Point 1. 



The various subdetector digitization packages are steered 
by a top-level Python digitization package which ensures 
uniform and consistent configuration across the subdetec- 
tors. The properties of the digitization algorithms were 
tuned to reproduce the detector response seen in lab tests, 
test beam data, and cosmic ray running. Dead channels 
and noise rates are read from database tables to reproduce 
conditions seen in a particular run. In some cases, dead 
channels are removed during the reconstruction step. 

The digits of each subdetector are written out as Raw 
Data Objects (RDOs). For some subdetectors this requires 
the digits produced to be converted to RDOs by a second 
algorithm during the digitization process. For others there 
is no intermediate digit object and RDOs are produced 
directly from the hits. In addition to RDOs, the digitiza- 
tion algorithms can also produce Simulated Data Objects 
(SDOs). These SDOs contain information about all the 
particles and noise that contributed to the signal produced 
in the given sensor and the amount of energy contributed 
to the signal by each. The relationship between RDOs and 
SDOs depends on the particular subdetector. For example, 
in the SCT each RDO represents a group of consecutive 
strips which recorded a hit, whereas one SDO is produced 
for each strip where energy was deposited by a particle 
in the Monte Carlo truth tree. No SDOs are created in 
the calorimeter. SDOs are mainly used for determining 
tracking efficiency and fake track rates. 
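The SCT example above, in which an RDO groups consecutive fired strips while SDOs are kept per strip, can be sketched as follows. This is a simplified Python illustration: the data layout and function name are hypothetical, thresholds are ignored, and noise is modelled only as a list of extra fired strips.

```python
def build_rdos_and_sdos(deposits, noise_strips=()):
    """Group fired strips into RDO-like clusters and keep per-strip truth.

    deposits: dict mapping strip number to energy deposited by truth particles.
    noise_strips: strips fired by noise only (no truth deposit).
    Returns (rdos, sdos): rdos is a list of (first_strip, n_strips) runs of
    consecutive fired strips; sdos maps each strip with a truth deposit to
    its energy.
    """
    fired = sorted(set(deposits) | set(noise_strips))
    rdos = []
    for strip in fired:
        if rdos and strip == rdos[-1][0] + rdos[-1][1]:
            # Strip is adjacent to the current run: extend it.
            rdos[-1] = (rdos[-1][0], rdos[-1][1] + 1)
        else:
            rdos.append((strip, 1))
    sdos = dict(deposits)
    return rdos, sdos


rdos, sdos = build_rdos_and_sdos({10: 0.2, 11: 0.3, 40: 0.1},
                                 noise_strips=(12, 77))
```

In this toy case three RDO-like clusters result from five fired strips, but only the three strips with truth deposits receive an SDO entry.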

Simulating the detector readout in response to a sin- 
gle interesting hard scattering interaction is unrealistic. 
In reality, for any given bunch crossing there may be mul- 
tiple proton-proton interactions. In addition to the hard 
scattering which triggers the detector readout, many
inelastic, non-diffractive proton-proton interactions may
appear. These interactions must be included in a realistic
model of detector response. The effects of beam gas and
beam halo interactions, as well as detector response to
long-lived particles, must also be incorporated. These
interactions are treated separately at the event generation and
simulation stages. Within a digitization job, hits from the
hard scattering are overlaid with those from the requested
number of these additional interactions before the detector
response is calculated. Because of long signal integration
times, most subdetector responses are affected by
interactions from neighboring bunch crossings as well.
Therefore, additional interactions offset in time are overlaid as
necessary. The overlaying of these various types of events,
known collectively as "pile-up," is described in Section 6.2.

Fig. 6. An event display made with VP1. A Higgs boson decays into four muons (shown in red). Inner detector tracks are in
green, and energy deposited in the calorimeter by the muons is shown in yellow.

Fig. 7. A Higgs boson decaying into four muons, with only the inner detector tracks and hits in the TRT being displayed by
VP1.

Table 11. Hit collection size, in kB per event, by subdetector. The average was taken of 50 simulated $t\bar{t}$ events. Calorimeter
calibration hits are hits in the dead material of the calorimeters stored for studying simulation-based calorimeter calibration
schemes.

Collection Name                      Size [kB/event]   Percentage of File
Silicon pixel tracker                82                4%
Silicon strip tracker                356               16%
Transition radiation tracker         921               46%
Electromagnetic Barrel calorimeter   89                4%
Electromagnetic Endcap calorimeter   104               5%
Hadronic Barrel calorimeter          29                1%
Hadronic Endcap calorimeter          22                1%
Forward calorimeter                  42                2%
Calorimeter calibration hits         243               12%
Muon system (all collections)        3                 <1%
Truth (all collections)              134               7%
Total                                1987              100%



Before reconstruction can be run, bytestream data from 
the real detector must be converted into RDO format. As 
mentioned above, the digitization usually avoids this step 
by writing out RDOs directly. However, in order to do 
simulation studies with the High Level Trigger it is nec- 
essary to translate the RDO files into bytestream format. 
There is some loss due to truncation in the first conver- 
sion from RDO to bytestream, but the inverse operation is 
basically lossless. Having the ability to convert output in 
both directions also allows evaluation of the conversions 
themselves. 
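A toy model shows why a truncating conversion loses information only on the first pass. In the Python sketch below, the word width and the least-significant-bit size are hypothetical and do not describe the actual ATLAS bytestream format; the point is only that once a value has been truncated to the integer grid, further round trips reproduce it exactly.

```python
N_BITS = 10  # hypothetical word width
LSB = 0.5    # hypothetical size of one count


def to_bytestream(value):
    """Encode a value as a fixed-width integer word (truncating)."""
    counts = int(value / LSB)  # truncation toward zero loses the remainder
    return max(0, min(counts, 2 ** N_BITS - 1))


def from_bytestream(word):
    """Decode an integer word back into a value."""
    return word * LSB


first_pass = from_bytestream(to_bytestream(12.3))
second_pass = from_bytestream(to_bytestream(first_pass))
```

Here the first conversion maps 12.3 onto the nearest lower grid value, and every later conversion in either direction leaves that value unchanged.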



6.1 Digitization Configuration 

The ATLAS digitization takes as input hit files produced 
by the ATLAS simulation. For pile-up simulation, there 
are also input hit files for each type of background inter- 
action to be overlaid. In such cases it is the main hard scat- 
tering event which sets the run number and event number. 
Run and event numbers from overlaid events are ignored. 
The digitization steering package exists entirely in the 
Python layer and configures how the digitization will be 
performed before the event loop starts. This configuration 
is highly flexible, but also ensures that sensible default
values are given for each configurable property of the job. In
the configuration of digitization jobs, the user may
specify the number of events to digitize, the number of leading
events to skip in the input file, the input hit file(s), and 
the output file. Digitization and writing out of RDOs may 
be enabled or disabled by subdetector. In order to ensure 
consistency, the detector layout version is, by default, read 
from the hard scattering events' hit file metadata. 

Digitization options also include the following: 

Detector Noise Simulation: Detector noise simulation can 
be turned off in the inner detector, calorimeter or muon 
spectrometer or any combination thereof. This is useful 
for data overlay jobs where noise is taken from real 
data events and for studies using a noise-free detector. 

Random Number Services: The type of random number 
engine to be used in all digitization algorithms can 
be specified (Ranlux64, the default, or Ranecu [102]).
Each algorithm has one or more random number streams. 
Random number seeds can be initialized from a text 
file or set in job options. The user may alternately 
specify an offset from the default values of the seeds, 
to be used in all streams. 

Metadata: In the default configuration, metadata from 
the simulation stage are used to configure the physics 
list (for setting the sampling fraction of the calorime- 
ters) and the detector layout. The metadata can be 
overridden. 

Pile-up Background Events: The overlay of minimum bias,
cavern background, beam gas and beam halo events 
can all be configured separately. In each case the mean 
number of events (if any) per bunch crossing to be over- 
laid and a collection of files containing the events to 
be overlaid onto the signal events can be specified. 

Beam Properties: The LHC beam bunch spacing can be 
configured, as can the number of bunch crossings to 
overlay before and after the hard scattering event. 

Detector Conditions: Default detector conditions
(including, e.g., dead electronics and noisy channels) are
associated with each detector layout. Non-default
conditions may be specified globally or by subdetector for
use in digitization.

After a check to make sure that at least one subde- 
tector has been left switched on, the input and output 
streams are initialized. GeoModel is initialized using the 
detector layout and conditions versions read from the hits 
file metadata or specified by the user. Setting the detec- 
tor layout version to be different from that used in the 
simulation is possible, but considered to be an expert ac- 
tion. The magnetic field service is initialized at this point. 
It is necessary because the magnetic field affects charge 
propagation from the active regions of the detector to the 
readout surfaces. 

At this point, caches for pile-up events are created and
configured with the appropriate collection of hits files as 
well as the number of events to be overlaid per bunch 
crossing. These caches are controlled by an overall pile- 
up manager service. A second pile-up service is created 
to hold information about the time window within which 
interactions can affect the response recorded by each sub- 
detector. During the initialization stage, this information 
can be combined with the bunch spacing to calculate the 
number of bunch crossings which should be simulated for 
each subdetector for each event. 
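The combination of a time window with the bunch spacing can be sketched as a short calculation. The counting rule below (whole bunch spacings that fit before and after the triggering crossing, plus the triggering crossing itself) is an assumption inferred from the values quoted in Table 12, which it reproduces; the function name is hypothetical.

```python
def n_bunch_crossings(window_ns, bunch_spacing_ns):
    """Count the bunch crossings that fall inside a simulation window.

    window_ns: (t_min, t_max) relative to the triggering crossing, t_min <= 0.
    """
    t_min, t_max = window_ns
    before = int(-t_min // bunch_spacing_ns)
    after = int(t_max // bunch_spacing_ns)
    return before + after + 1  # +1 for the triggering crossing itself


lar_25ns = n_bunch_crossings((-801, 126), 25)
lar_75ns = n_bunch_crossings((-801, 126), 75)
```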

Subsequently, the subdetector digitization algorithms 
are configured and added to the sequence of algorithms 
to be run in the job. The collections of RDOs, hits, and 
truth information which are to be recorded are added to 
the output stream. Digitization algorithms exist for the 
following subdetectors: 

Inner Detector: BCM, silicon pixel tracker, SCT, and TRT.
Calorimeter: LAr and tile calorimeters. Separate algorithms
also exist to simulate the formation of trigger towers
in the calorimeters, which serve as inputs to the level
one trigger.
Muon Spectrometer: Cathode Strip Chambers, Monitored
Drift Tubes, Resistive Plate Chambers and Thin Gap
Chambers.

If requested, the level one trigger simulators are added to 
the algorithm sequence, provided that the digitization of 
the relevant parts of ATLAS has been turned on. The
default mode of simulation production is to run the level one
trigger simulation during the reconstruction step rather 
than as part of the digitization step. 

As the digitization algorithm for each subdetector is 
configured, the names and seeds for the random number 
streams it requires are added to a list. In the case where 
seeds are to be read in from a file, the default list of stream 
names and their seeds are replaced by the file contents. 
Once all algorithms have been configured, the list is used 
to configure the random number service. Separate random 
number streams are used for each subdetector digitization 
algorithm and give the same result independent of what 
is used for the other subdetectors (see footnote 11).
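The independence of the per-algorithm random number streams can be illustrated with a small Python sketch. It uses Python's built-in Mersenne Twister generator rather than Ranlux64 or Ranecu, and both the function name and the way default seeds are derived from stream names are assumptions of this example, not the ATLAS random number service.

```python
import random
import zlib


def make_streams(stream_names, seed_offset=0):
    """Create one independent generator per named stream.

    The default seed is derived from the stream name alone (an assumption
    made for this sketch), so it does not depend on which other streams
    are configured. seed_offset shifts every default seed.
    """
    return {name: random.Random(zlib.crc32(name.encode("utf-8")) + seed_offset)
            for name in stream_names}


all_on = make_streams(["Pixel", "SCT", "TRT"])
sct_only = make_streams(["SCT"])
same_sequence = ([all_on["SCT"].random() for _ in range(3)]
                 == [sct_only["SCT"].random() for _ in range(3)])
```

Because each stream is seeded on its own, switching other subdetectors on or off leaves the sequence drawn by a given stream unchanged, while a common offset moves all streams to new sequences at once.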



Much of the job configuration information, along with 
the detector layout version, is written to the output file 
as digitization metadata. The run number provided in the 
simulation metadata is used to establish a validity range 
for the digitization metadata corresponding to the cur- 
rent run only. At this point the digitization job is fully 
configured and the event loop begins. 



6.2 Pile-up 

To simulate pile-up, various types of events are read in, 
and hits from each are overlaid. The different types consid- 
ered can be configured at run time, and normally comprise 
signal, minimum bias, cavern background, beam gas, and 
beam halo events. The number of events to overlay of each 
type per bunch crossing may also be set at run time and 
is a function of the luminosity to be simulated. The mean 
number of interactions per bunch crossing (BC), for
example 23 at the design luminosity of $10^{34}$ cm$^{-2}$ s$^{-1}$ with 25 ns
bunch spacing, depends linearly on luminosity and bunch
spacing. However, this number is Poisson-distributed, with
a long tail beyond the most probable value. Thus, a sub- 
stantial fraction of the bunch crossings will have more 
than the average number of interactions. In addition, the 
ATLAS subdetectors are sensitive to hits several bunch 
crossings before and after the BC that contains the hard
scattering event (which triggers the readout). Table 12
shows the simulation window for each detector along with 
the corresponding number of bunch crossings for 25 ns 
and 75 ns bunch spacing. All of these detector and elec- 
tronic effects are taken into account during the pile-up 
event merging. 
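The scaling of the mean and the size of the Poisson tail can be checked with a short calculation. The Python sketch below takes the value of 23 interactions at $10^{34}$ cm$^{-2}$ s$^{-1}$ and 25 ns from the text and applies the stated linear dependence; the function names are hypothetical.

```python
import math


def mean_interactions(luminosity, bunch_spacing_ns):
    """Mean interactions per crossing, scaled linearly from the design point."""
    return 23.0 * (luminosity / 1e34) * (bunch_spacing_ns / 25.0)


def prob_more_than(n, mean):
    """Poisson probability of observing more than n interactions."""
    cdf = sum(math.exp(-mean) * mean ** k / math.factorial(k)
              for k in range(n + 1))
    return 1.0 - cdf


mu = mean_interactions(1e34, 25.0)
tail = prob_more_than(23, mu)
```

Evaluating the tail at the design point gives the fraction of bunch crossings with more than 23 interactions, which is what is meant by a substantial fraction exceeding the average.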



6.2.1 Cavern Background 

Neutrons may propagate through the ATLAS cavern for 
a few seconds before they are thermalized, thus producing 
a neutron-photon gas. This gas produces a constant back- 
ground, called "cavern background," of low-energy elec- 
trons and protons from spallation. The cavern background 
consists mainly of thermalized slow neutrons, long-lived 
neutral kaons and low-energy photons escaping the calo- 
rimeter and the forward shielding elements. Muon detec- 
tors are most affected by high cavern-background rates. 
The radiation levels to be expected in the ATLAS cavern 
scale with luminosity, and they have been simulated as 
a function of r and z [103] for the design luminosity of
$10^{34}$ cm$^{-2}$ s$^{-1}$. Depending on the type of radiation, exact
composition of the equipment, and sensitivity of the study,
the rates sometimes have to be increased by a safety
factor. Cavern background is produced in the following way:



— A standalone dedicated Geant3/GCALOR-based [104]
detector simulation program with improved neutron
propagation and a simplified ATLAS detector
geometry is run on proton-proton collisions. The cavern
walls are not included in the detector description. The
output of this program includes particle fluxes in the
envelopes surrounding muon spectrometer chambers.
The fluxes are provided as a list of particles with all
related parameters per proton-proton interaction at the
entrance of each envelope.

— The kinematic information of all particles generated
by Geant3/GCALOR is converted to HepMC
format, and the flux is modified to be uniform in the
time interval of the required bunch spacing (typically
[0, 25 ns]).

— The simulation is then carried out using the full
detector geometry and Geant4, and hits are stored.

— The cavern events are mixed, with a safety factor of
up to 10, at the digitization level with the minimum
bias and signal events.

11 Here "digitization algorithm" does not include the
calorimeter trigger tower simulation algorithms, which require the
corresponding calorimeter digitization to be performed.
Similarly, the level one trigger simulation requires the simulation
and digitization of the expected trigger inputs to give
meaningful results.

Table 12. The time window (relative to the current bunch crossing) during which interactions in each subdetector are simulated
during pile-up jobs, along with the corresponding numbers of bunch crossings simulated in the case of 25 ns bunch spacing and
75 ns bunch spacing.

Subdetector       Simulation Window [ns]   No. Bunch Crossings       No. Bunch Crossings
                                           (25 ns bunch spacing)     (75 ns bunch spacing)
BCM               -50, +25                 4                         1
Pixel trackers    -50, +25                 4                         1
SCT               -50, +25                 4                         1
TRT               -50, +50                 5                         1
LAr calorimeter   -801, +126               38                        12
Tile calorimeter  -200, +200               17                        5
Muon chambers     -1000, +700              69                        23

There are a number of issues with the current sim- 
ulation of the cavern background. The original primary 
cavern events were generated in an older version of
Pythia where the generated particle density is a factor of two
lower than in the newer versions of Pythia [42, 43]. The
statistics for the available cavern events are limited: 40,000 
events are available with a safety factor of 1; 10,000 events 
are available with a safety factor of 2 or 5; and 5,000 events 
are available with a safety factor of 10. Because of the lim- 
ited statistics, a number of monitored drift tubes in the 
muon detector fire more often than expected (i.e. there 
are spikes in the hit response of the detector).
Additionally, neutral particles are tracked through the entire
detector during simulation, thus producing additional hits from
particles that should have been removed at the edges of 
the muon chamber envelopes (multiple counting). 

In the short term, the problem of limited statistics of 
the cavern events has been alleviated by taking advantage 
of the φ-symmetry of the muon spectrometer: the cavern
events are rotated and re-simulated eight times or more 
(in multiples of eight). Further improvement in the avail- 
able cavern statistics can be achieved by repeating the 
simulation of the cavern events many times with different 
random number seeds, since the probability of a neutral 
interaction is very low, of the order 1%. 
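The reuse of cavern events through the φ-symmetry can be sketched as a rotation about the beam axis. In the Python example below, the rotation in steps of one eighth of a full turn is an assumption based on the eightfold reuse described above, and the function names and the representation of particles as (x, y, z) triples are hypothetical.

```python
import math


def rotate_phi(vectors, k, n_fold=8):
    """Rotate (x, y, z) vectors about the z (beam) axis by k * 2*pi / n_fold."""
    angle = 2.0 * math.pi * k / n_fold
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in vectors]


def eightfold_copies(vectors):
    """Return the eight rotated copies of an event, including the original."""
    return [rotate_phi(vectors, k) for k in range(8)]


copies = eightfold_copies([(1.0, 0.0, 5.0)])
```

The same rotation would be applied to both positions and momenta of the particles entering the envelopes, so that each copy is a physically consistent event in a different φ sector.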



6.2.2 Beam Halo and Beam Gas 

Beam halo is the background resulting from interactions 
between the beam and upstream accelerator elements. The 
flux from upstream (in the tunnel and collimators) is
provided by the LHC Machine Division [105, 106]. Beam halo
events are generated as discrete particle losses against the
upstream collimators. The LHC machine division has esti- 
mated the proton loss rate in design conditions as being on 
the order of 1 MHz. Fluka simulation of the last 150 m of 
the beamline indicates that daughter particles from these 
proton losses will reach the cavern wall (23 m from the in- 
teraction point) at a rate of ~ 400 kHz. This flux is input 
to the normal Geant4 simulation to produce hit files. 

Beam gas includes the residual hydrogen, oxygen, and 
carbon gases in the ATLAS beam pipe. Beam gas
interaction events are generated with Hijing (see Section 3.2.4)
with appropriate time offsets. The interactions are allowed 
to take place anywhere in the beam pipe of ATLAS, 23 m 
in either direction from the interaction point. 



6.2.3 Pile-up with Real Data 

The pile-up mechanism described above will not work with 
real data, because it begins at the hit level. One must in- 
stead overlay events beginning from detector electronics 
output (RDOs). One may collect minimum bias, cavern 
background, beam halo, and beam gas backgrounds from 
the same "zero bias" trigger used to understand detector 
electronic noise. Then, one would overlay hits from sim- 
ulated hard scattering events onto the zero bias trigger 
data to simulate the pile-up. The zero bias trigger data 
needed for this type of event overlay can be selected at 
random from the filled-bunch crossings (see footnote 12). The
subdetectors should be read out with as little zero-suppression as
is possible and with the HLT in pass-through mode (i.e.
without further filtering). One can use bunch-by-bunch
luminosity information to correctly weight the event sample
for pile-up studies.

In principle one needs as many zero bias events as
generated events, but in practice zero bias events can be
reused with independent simulated data sets without
introducing any bias. During data taking, zero bias events
are sampled at all times, because detector and cavern
conditions are likely to vary with time. The zero bias events
could, for example, be collected exactly one orbit after a
high-$p_T$ trigger has fired. Appropriately pre-scaled to the
output rate needed for simulations (on the order of 1-2 Hz,
or about 1% of the recorded events), this means that the
rate follows the luminosity and that the bunch structure
is guaranteed to be right.

12 The zero bias trigger is not a minimum bias trigger



6.3 RDO Storage Format 

The ATLAS detector electronics produce data in
bytestream format. The RDO format can be thought of as a
POOL-compatible version of the bytestream (see footnote 13). The file
size on disk is typically around 2.5 MB per event for hard
scattering events (e.g. $t\bar{t}$ production) and increases in the
presence of pile-up. Table 13 shows an example of the
disk consumption by container for 50 $t\bar{t}$ events without
pile-up and with pile-up at $10^{33}$ cm$^{-2}$ s$^{-1}$. In the absence
of pile-up, the calibration hit collections described in
Section 5.6, which are copied directly from the hit file to
the RDO, are among the main consumers of disk space. As
pile-up luminosity increases, however, the inner detector
containers become increasingly significant.



7 Fast Simulations 

Because of the complicated detector geometry and de- 
tailed physics description used by the ATLAS Geant4 
simulation, it is impossible to achieve the required sim- 
ulated statistics for many physics studies without faster 
simulation. To that end, several varieties of fast simula- 
tion programs have been developed to complement the 
Geant4 simulation. In this section, the standard Geant4 
simulation will be referred to as "full simulation." 

Almost 80% of the full simulation time is spent
simulating particles traversing the calorimetry, and about 75%
of the full simulation time is spent simulating
electromagnetic particles. The Fast G4 Simulation aims to speed up
this slowest part of the full simulation [107, 108]. The
approach taken, therefore, is to remove low energy
electromagnetic particles from the calorimeter and replace them
with pre-simulated showers stored in memory. Using this
approach, CPU time is reduced by a factor of three in hard
scattering events (e.g. $t\bar{t}$ production) with little physics
penalty. This simulation may eventually become the
default simulation for all processes that do not require
extremely accurate modeling of calorimeter response or
electromagnetic physics.

ATLFAST-I has been developed for physics parameter
space scans and studies that require very large statistics
but do not require the level of detail contained in the
full simulation [109, 110]. Truth objects are smeared by
detector resolutions to provide physics objects similar to
those of the reconstruction. Object four-vectors are
output, without any detailed simulation of efficiencies and
fakes. A factor of 1000 speed increase over full simulation
is achieved with sufficient detail for many general studies.

ATLFAST-II is a fast simulation meant to provide
large statistics to supplement full simulation studies. The
aim is to simulate events as fast as possible while
still being able to run the standard ATLAS
reconstruction. ATLFAST-II is made up of two components: the
Fast ATLAS Tracking Simulation (Fatras) for the inner
detector and muon system simulation and the Fast
Calorimeter Simulation (FastCaloSim) for the calorimeter
simulation. Optionally, any subdetector can be simulated
with Geant4 to provide a higher level of accuracy
without the same CPU time consumption as full simulation of
the entire detector. An improvement over full simulation
time of a factor of 10 is achieved with full Geant4
inner detector and muon simulation and FastCaloSim, and
a factor of 100 is achieved with Fatras and FastCaloSim.

13 POOL compatibility requires separate transient and
persistent object representation.



7.1 Fast G4 Simulation 

The Fast G4 Simulation reduces CPU time consumption 
without sacrificing accuracy by speeding up the slowest 
parts of the full simulation. By treating (as described be- 
low) electromagnetic showers in the sampling portions of 
the calorimeters, a reduction in CPU time of a factor of 
three can be achieved even in hadronic events. Although 
the calorimetry dominates simulation time for the full sim- 
ulation, after treatment of the electromagnetic showers the 
simulation time is evenly distributed throughout all sub- 
detectors. One particular advantage of this fast simula- 
tion over the other varieties is that its output file matches 
identically the format of the output of the full Geant4 
simulation. The data can therefore be run through the 
identical tests and digitization software following simula- 
tion, and the standard ATLAS trigger and reconstruction 
can be run. 

There are three treatments applied to electromagnetic 
showers. For very high energy (>10 GeV) electrons and 
positrons, a tuned shower parameterization is available. 
For medium energy (10 MeV to 1 GeV) electrons, posi- 
trons, and photons, libraries of pre-simulated showers can 
be applied during the event. For very low energy (<10 
MeV) electrons and positrons, a single hit can be de- 
posited to recreate detector response. Each one of these 
treatments can be turned on by the user in each compart- 
ment of the electromagnetic calorimeter and the forward 
hadronic calorimeter. The energy ranges can be set for 
each method, compartment, and particle. 
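The choice among the three treatments can be pictured as a dispatch on particle type and energy. The Python sketch below uses the energy ranges quoted above as fixed defaults, whereas in practice they are configurable per method, compartment, and particle; the function name, the particle labels, and the handling of the range boundaries are assumptions of this example.

```python
def shower_treatment(particle, energy_mev):
    """Pick the fast-simulation treatment for an electromagnetic particle.

    Returns "parameterization", "shower_library", "single_hit", or
    "full_simulation" when no fast treatment applies.
    """
    is_e = particle in ("electron", "positron")
    if is_e and energy_mev > 10_000:  # very high energy: > 10 GeV
        return "parameterization"
    if is_e and energy_mev < 10:  # very low energy: < 10 MeV
        return "single_hit"
    if (is_e or particle == "photon") and 10 <= energy_mev <= 1_000:
        return "shower_library"  # medium energy: 10 MeV to 1 GeV
    return "full_simulation"
```

With these defaults, particles between 1 GeV and 10 GeV, as well as photons outside the library range, fall through to the full simulation.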

For electrons and positrons above a sufficiently high 
energy, around 10 GeV in the central calorimeters, the 
sampling calorimeter is sufficiently homogeneous to ap- 
ply a shower parameterization. Small steps are taken in 
the direction of the original particle, depositing energy ac- 
cording to several tuned functions as it traverses the detec- 
tor. The longitudinal profiles of showers are parameterized 
and normalized with an energy scale to approximate the
sampling fraction in each subdetector. The radial profile
changes as a function of depth in the shower and is
normalized by the longitudinal profile. Energy is deposited
in hits in order to mimic the full simulation. Fluctuations
are introduced in three separate places, representing the
random characteristics of shower length and shape, the
sampling resolution of the calorimeter, and the geometric
fluctuations in the energy collected.

Table 13. Container size on disk in RDO files. Columns two and three show the average of 50 $t\bar{t}$ events digitized in the absence
of pile-up. These events were chosen because they produce large energy deposits throughout the detector. Columns four and
five show the average of 50 $t\bar{t}$ events digitized in the presence of pile-up with $10^{33}$ cm$^{-2}$ s$^{-1}$ luminosity and 25 ns bunch spacing.

Category                       No Pile-up       No Pile-up   Pile-up          Pile-up
                               Space on Disk    Percentage   Space on Disk    Percentage
                               [kB/event]       of File      [kB/event]       of File
Inner Detector RDOs            187              7.6          322              11.5
Inner Detector SDOs            247              10.0         333              11.9
Calorimeter Raw Channels       995              40.3         1006             35.9
Calorimeter Calibration Hits   601              24.3         601              21.4
Muon Spectrometer RDOs         1                0.04         27               1.0
Muon Spectrometer SDOs         1                0.05         59               2.1
Level One Trigger              289              11.7         300              10.7
Truth                          147              6.0          151              5.4
Headers                        <1               0.01         2                0.1
Total                          2469             100.00       2801             100.00

Particles captured by the fast simulation in the ap- 
propriate energy range (typically <1 GeV) are replaced 
by a shower from a pre-simulated library, rotated and 
scaled to match the primary particle. Shower libraries are 
generated in bins of pseudorapidity and energy for elec- 
trons and photons. Only hits in the sensitive detectors 
are stored in order to save space on disk and in memory. 
The binning reproduces the fine structure in the calori- 
meters. The libraries are read into memory the first time 
they are requested by the simulation, ensuring minimal 
memory overhead. They consume about 200 MB when 
all are in use. Showers are randomly selected with linear 
weighting from the energy bin above or below the primary 
particle and from the pseudorapidity bin above or below 
the particle. The shower is then rotated to match the pri- 
mary particle's original direction, and the energy is scaled 
to match the primary particle's energy. For example, an 
electron at pseudorapidity of 2.37 and with an energy of 
12 MeV might use a shower from the electron and positron 
library's 2.4 bin in pseudorapidity and 10 MeV bin in en- 
ergy. The shower would then be moved and rotated to 
match the position and direction of the original particle, 
and its energy would be scaled up by 20%. 
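
The bin selection and energy scaling described above can be sketched as follows. This is a minimal illustration, not the ATLAS implementation: the bin centres, the hit tuples, and the helper names are hypothetical, and "linear weighting" is interpreted here as choosing the upper neighbouring bin with a probability proportional to how close the particle is to it.

```python
import bisect
import random

def pick_neighbor_bin(value, bin_centers, rng=random):
    """Pick the bin below or above `value`, weighted linearly by proximity."""
    hi = bisect.bisect_left(bin_centers, value)
    if hi == 0:
        return 0                        # below the first bin centre
    if hi == len(bin_centers):
        return len(bin_centers) - 1     # above the last bin centre
    lo = hi - 1
    frac = (value - bin_centers[lo]) / (bin_centers[hi] - bin_centers[lo])
    return hi if rng.random() < frac else lo

def scale_shower(hits, library_energy, primary_energy):
    """Scale every hit energy so the shower matches the primary's energy."""
    factor = primary_energy / library_energy
    return [(x, y, z, e * factor) for (x, y, z, e) in hits]

# Example from the text: a 12 MeV electron using a shower from the 10 MeV bin.
energy_bins_mev = [10.0, 20.0, 50.0]                        # hypothetical bin centres
hits_10mev = [(0.0, 0.0, 1.0, 6.0), (0.1, 0.0, 2.0, 4.0)]   # made-up hits (x, y, z, E in MeV)
scaled = scale_shower(hits_10mev, 10.0, 12.0)               # energies scaled up by 20%
```

The same `pick_neighbor_bin` helper would be applied once to the energy bins and once to the pseudorapidity bins; rotating and translating the hits to the primary particle's direction is omitted here.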



Electrons and positrons with energies below about 10 MeV
typically deposit only one hit in the sensitive region of
the calorimeter. These particles are removed ("killed") when
inside the regular sampling region of any of the calorimeters,
and a single hit is placed in the calorimeter.
The position of the hit is determined by a random exponential
number times the radiation length in the detector
in order to approximate the particle's range. The energy
of the hit is scaled to the response of the detector and
smeared by its resolution.

The standard combination of strategies is shown in
Table 14. The strategies used in a particular subdetector
are optimized for maximum CPU time improvement with
minimal complexity. The upper energy bound for shower
libraries, 1 GeV, balances memory use with speed. Libraries
at higher energies also may not correctly reproduce
the tails of electromagnetic shower shape distributions as
well as low-energy libraries do. The minimum energy for
application of the parameterization model is based purely
on CPU time. In most subdetectors, it is faster to allow
Geant4 to produce 1 GeV secondaries and apply shower
libraries to those secondaries than it is to apply the shower
parameterization. The same argument applies to high energy
photons. They pair produce sufficiently quickly that
treating them separately only adds complexity to the models.
The speed of the parameterization is limited by random
number generation and locating hits within the detector
geometry.



7.2 ATLFAST-I

ATLFAST-I performs a fast simulation of the ATLAS detector,
including object reconstruction, in order to produce
high statistics samples of signal and background events.
The lowest possible CPU time per event is achieved
by replacing detailed detector simulation with parameterizations
of the desired detector and reconstruction effects.
The high speed of simulation in ATLFAST-I makes it possible
to study channels where the statistics involved would
otherwise be prohibitive. For example, the background to
a $Z \to \tau^+\tau^-$ study from fake taus in di-jet events is
expected to require $O(10^9)$ events in 100 fb$^{-1}$ of data. Some
searches also require many datasets to be simulated in order
to scan across parameter space for the model being
tested, such as SUSY.



Table 14. The default combination of strategies used in the Fast G4 Simulation for each calorimeter compartment.

Calorimeter               Parameterization   Shower Libraries                        Killing
Electromagnetic Barrel    Not used           10 - 1000 MeV e+e-; < 10 MeV photons    < 10 MeV e+e-
Electromagnetic Endcap    Not used           10 - 1000 MeV e+e-; < 10 MeV photons    < 10 MeV e+e-
Electromagnetic Forward   > 9 GeV e+e-       10 - 1000 MeV e+e-; < 10 MeV photons    < 10 MeV e+e-
Hadronic Forward          > 1.5 GeV e+e-     Not used                                < 10 MeV e+e-



ATLFAST-I is the least detailed simulation method. 
There is no realistic detector description, so studies of 
detector-based quantities, such as calorimeter sampling 
energies and track hit positions, are not possible. There is 
no simulation of reconstruction efficiency or misidentifica- 
tion rates, discussed later on, which means the presence 
of genuine physics objects is overestimated while fake
objects are not modeled, with two exceptions. Because 
jet-flavor tagging efficiencies are applied, fake b-jets and 
taus are simulated. However, ATLFAST-I provides a use- 
ful method of making quick estimates of systematic uncer- 
tainties in early data analyses due to the simple process of 
re-parameterizing the detector and modeling reconstruction
effects. The speed of operation enables datasets to
be reproduced with different generator configurations, al- 
lowing quick estimates of systematic uncertainties arising 
from generators. 

Common to the reconstruction of all objects in ATL- 
FAST-I is that by default no reconstruction efficiencies are 
applied. These efficiencies can be taken from full simula- 
tion and accounted for by the user in the analysis. This 
applies to electrons, photons and jets as well as to ATL- 
FAST-I tracks. It should be noted that tagging efficiency 
factors are implicitly taken into account in the tau- and 
b-tagging procedures. A system to apply a common set 
of efficiencies and misidentification rates at the analysis 
stage is in development. The misidentification rates will 
allow the modeling of fake objects as well. 

ATLFAST-I takes input in HepMC format, enabling 
it to read the output of all ATLAS generators. Generator 
input is filtered to choose only particles that are useful 
in the current step. For example, only charged particles 
are considered in the tracking stage, and all particles are 
required to be a part of the final state. 
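
The input filtering for the tracking stage can be sketched as follows. This is a hedged illustration: the `GenParticle` class and function names are hypothetical stand-ins for a HepMC particle record, status code 1 is the usual HepMC convention for final-state particles, and the kinematic cuts are the ones quoted for ATLFAST-I tracks.

```python
import math
from dataclasses import dataclass

@dataclass
class GenParticle:
    """Hypothetical minimal stand-in for a HepMC generator particle."""
    status: int      # 1 marks a final-state particle in the HepMC convention
    charge: float
    px: float        # momentum components in GeV
    py: float
    pz: float

def pseudorapidity(p):
    """eta = asinh(pz / pT); only valid for pT > 0."""
    return math.asinh(p.pz / math.hypot(p.px, p.py))

def select_for_tracking(particles, min_pt=0.5, max_abs_eta=2.5):
    """Keep charged final-state particles with pT > 500 MeV and |eta| < 2.5."""
    selected = []
    for p in particles:
        pt = math.hypot(p.px, p.py)
        if p.status != 1 or p.charge == 0 or pt <= min_pt:
            continue
        if abs(pseudorapidity(p)) < max_abs_eta:
            selected.append(p)
    return selected
```

Other stages would apply their own filter to the same input, for example keeping neutral final-state particles for the calorimeter step.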

The following sections describe steps taken in ATL- 
FAST-I. 



7.2.1 Tracks

Charged particle tracks from the generator with $p_T >$
500 MeV and with $|\eta| < 2.5$ are considered as reconstructed
ATLFAST-I tracks, and five track parameters$^{14}$ are
associated to them. These parameters are calculated from
the true particle properties by applying parametrized
resolution functions which account for the measurement
precision, energy loss, and multiple scattering as well as for
hadronic interactions in the inner detector material. The
resolution functions are taken from fully simulated events.
The non-Gaussian tails resulting from hadronic interactions
are taken into account by applying a double-Gaussian
correlated smearing to the track parameters of hadrons
[109, 110]. No vertex smearing is applied. In ATLFAST-I,
three types of charged particles are distinguished:
hadrons, electrons and muons. Due to the relatively large
energy loss from bremsstrahlung, high-$p_T$ electrons are
treated separately, and an additional energy loss correction
is applied. It should be noted that while these tracks
are used for specific studies in B physics, they are not used
for lepton identification or b-tagging.

14 The five parameters are: the azimuthal angle $\phi$; longitudinal impact parameter, $z_0$; transverse impact parameter, $d_0 = \sqrt{x^2 + y^2}$; polar angle $\theta$; and charge divided by momentum amplitude.



7.2.2 Track-Based Tau Identification

Track-based tau identification is split into two distinct
parts, namely reconstruction and identification of tau
candidates. The reconstruction part applies a parameterized
efficiency to the tracks to calculate the charged component
in a tau candidate, while the neutral component is
calculated directly from neutral particles in the generated
event.

Once a sample of tau candidates has been reconstructed,
the identification part is carried out by separating the
sample into true and fake taus. True taus are defined as
those matched to a hadronic decay in the truth record
with $\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2} < 0.2$, whereas the remainder
are considered fakes. Subsequently, a parametrization of
the identification efficiency is applied based on the number
of tracks.



7.2.3 Calorimetry

Stable charged particles from the event generators are
propagated through the magnetic field along a simple helix.
The primary vertex is assumed to be at the geometric
origin. Using a helix model and assuming a perfectly
homogeneous magnetic field inside the central tracking
volume, the impact point on the calorimeter surface is
calculated. To calculate this point, no interactions of the 
particle with the detector material (i.e. no multiple scat- 
tering, energy loss, or nuclear interactions) are taken into 
account. In particular, this implies that no energy loss due 
to bremsstrahlung for electrons and no pair production 
from photons in the inner detector media are simulated 
for the energy depositions in the calorimeters. The effects 
of these interactions are, however, implicitly taken into 
account by the application of appropriate resolution func- 
tions. For the calculation of track parameters, the four- 
momenta and the starting point of the particles (e.g. for 
stable decay products of long-lived particles) are taken 
from the generator information. 
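
The helix extrapolation can be illustrated with a short calculation for a particle starting at the origin and reaching a cylindrical surface. This is a sketch under assumptions, not the ATLFAST-I code: the 2 T field value, the 1.5 m cylinder radius, the function name, and the sign convention (field along +z, so positive charges bend towards decreasing $\phi$) are chosen for illustration, and only a barrel-like cylinder is treated.

```python
import math

def helix_impact_point(pt_gev, eta, phi0, charge, b_tesla=2.0, r_calo_m=1.5):
    """Return (phi, z) of the impact on a cylinder of radius r_calo_m.

    The particle starts at the origin and moves in a uniform field along z
    with no material interactions. Returns None if the track curls up
    before reaching the cylinder.
    """
    if charge == 0:
        # Neutral particles travel in a straight line.
        return phi0, r_calo_m * math.sinh(eta)
    # Radius of curvature in metres: rho = pT / (0.2998 * B * |q|).
    rho = pt_gev / (0.2998 * b_tesla * abs(charge))
    if r_calo_m > 2.0 * rho:
        return None
    # Turning angle at which the chord from the origin has length r_calo_m.
    alpha = 2.0 * math.asin(r_calo_m / (2.0 * rho))
    sign = 1.0 if charge > 0 else -1.0
    phi = phi0 - sign * alpha / 2.0
    phi = math.atan2(math.sin(phi), math.cos(phi))   # wrap into (-pi, pi]
    # Transverse path length rho*alpha, times pz/pT = sinh(eta).
    z = rho * alpha * math.sinh(eta)
    return phi, z
```

For high transverse momentum the impact azimuth approaches the production azimuth, while very soft tracks never reach the calorimeter in this model.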

The energies of the electrons, photons, and hadrons 
are deposited in a calorimeter cell map. The response of 
the calorimeter is assumed to be unity and uniform over 
the full detector. No smearing (i.e. no resolution function) 
is applied. The energy of the particle is entirely deposited 
in the hit calorimeter cell, assuming a granularity of the 
calorimeter cell map of $\eta \times \phi = 0.1 \times 0.1$ for $|\eta| < 3.2$
and $\eta \times \phi = 0.2 \times 0.2$ for $3.2 < |\eta| < 5.0$. Neither
lateral nor longitudinal shower development is simulated. 
Therefore, the longitudinal fine structure of the calorime- 
ters is not taken into account. There is also no separation 
between the electromagnetic and the hadronic calorimeter 
compartments. 
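
A minimal sketch of such a cell map is given below, assuming the granularity quoted above. The indexing scheme, the function names, and the choice of keying cells by (cell size, $\eta$ index, $\phi$ index) are illustrative assumptions rather than the ATLFAST-I data structure.

```python
import math
from collections import defaultdict

def cell_index(eta, phi):
    """Map (eta, phi) to a cell key, or None outside |eta| < 5.0."""
    abs_eta = abs(eta)
    if abs_eta >= 5.0:
        return None
    size = 0.1 if abs_eta < 3.2 else 0.2      # granularity from the text
    i_eta = math.floor(eta / size)
    i_phi = math.floor((phi % (2.0 * math.pi)) / size)
    return (size, i_eta, i_phi)

def fill_cell_map(particles):
    """particles: iterable of (energy, eta, phi). The whole energy of each
    particle goes into the single cell it hits (unit response, no smearing)."""
    cells = defaultdict(float)
    for energy, eta, phi in particles:
        key = cell_index(eta, phi)
        if key is not None:
            cells[key] += energy
    return cells
```

Because neither lateral nor longitudinal shower development is modelled, two particles pointing at the same cell simply add their energies.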

Based on the map of deposited cell energies, cluster
reconstruction is carried out using either SISCone [112]
or FastKt algorithms via the FastJet libraries [113]. The
default clustering routine is SISCone with a cone size of
0.4. The cluster transverse energy must pass a threshold, 
typically 5 GeV. The clusters may get re-classified as elec- 
trons, photons, taus or jets in one of the following steps. 
If they are associated to one of these objects, they are 
removed from the list of clusters. 



7.2.4 Electrons and Photons 

For each true electron or photon, the reconstructed energy 
is obtained by smearing the true energy according to a 
resolution calculated by interpolating between resolutions 
measured in fully simulated events at precise values of $\eta$
and energy. If, after smearing, the candidate electron or
photon has transverse energy exceeding a threshold value,
typically 5 GeV, and has $|\eta| < 2.5$, then it is recorded with
the $\eta$ and $\phi$ directions of the true particle.

Electrons and photons are matched to calorimeter clusters
in $(\eta, \phi)$ space, with a maximum allowed separation
of $\Delta R = 0.15$. If there is a matching cluster then it is
removed from the list of clusters to be considered as jet 
candidates later on. 
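
The cluster matching can be sketched as below. The function names are hypothetical, and picking the closest cluster when several lie within the allowed separation is an assumption; the text only specifies the maximum separation.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Separation in (eta, phi) space, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def match_cluster(obj, clusters, max_dr=0.15):
    """Return the index of the closest cluster within max_dr, or None.

    obj is (eta, phi); clusters is a list of (eta, phi).
    """
    best, best_dr = None, max_dr
    for i, (eta, phi) in enumerate(clusters):
        dr = delta_r(obj[0], obj[1], eta, phi)
        if dr <= best_dr:
            best, best_dr = i, dr
    return best
```

A matched index would then be removed from the cluster list so that the cluster is not considered again as a jet candidate.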



7.2.5 Muons 

For each true muon with $p_T > 0.5$ GeV, the reconstructed
momentum is calculated from the true muon momentum.
A Gaussian resolution function which depends on $p_T$, $\eta$,
and $\phi$ is applied. After smearing, muons with $p_T > 5$ GeV
and with $|\eta| < 2.5$ are kept.



7.2.6 Isolation 

In order to define isolated electrons, photons, and muons, 
the following criteria are applied: the difference in the
unsmeared energy in a cone of $\Delta R = 0.2$ around the object
direction and its smeared energy needs to be below
10 GeV. In addition, there should be no further clusters
reconstructed with $\Delta R < 0.4$ around the object direction.



7.2.7 Jets 

All clusters that have not been assigned to a true electron 
or photon are considered jets if their transverse energy 
exceeds 10 GeV. The jet energy is taken to be the cluster 
energy, after adding non-isolated muons within $\Delta R = 0.4$,
and is smeared according to the jet energy resolution. 

These functions do not account for pile-up, although 
there is a "high luminosity" mode available which adds 
a pile-up term to the resolution. The pile-up correction 
is constant with respect to jet transverse energy and is 
dependent on the size of the jet. 

The jet direction is taken to be the cluster direction. 
Since the response function of the calorimeter is set to 
one, no jet calibration is needed to correct for the non- 
compensation of the calorimeter. However, an out-of-cone 
energy correction is needed. This correction is applied in
a separate jet calibration step [109, 110].



7.2.8 Tagging 

For each jet found, a label is attached to indicate whether 
the true jet originated from a light quark, b-quark, c- 
quark, or tau. This label is based on matching b or c par- 
tons or the visible decay products of hadronically-decaying 
taus at truth-level with $\Delta R < 0.3$ to a reconstructed jet.
In the case of hadronically-decaying taus, the ratio between
the true visible energy and the jet energy is also
required to be larger than unity minus $2\sigma$, where $\sigma$ is the
jet energy resolution as above. 

The results of b- and tau-tagging are then simulated 
by applying identification efficiencies and fake rates to the 
labels. These efficiencies are determined from full simulation
studies and are parameterized as a function of $p_T$ and
$\eta$.
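
The two tagging steps can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, $\sigma$ is treated as a fractional jet energy resolution, and the efficiency and fake rate are passed in by the caller (in ATLFAST-I they would come from the $p_T$- and $\eta$-dependent parameterizations).

```python
import random

def passes_tau_energy_ratio(visible_tau_energy, jet_energy, sigma):
    """Tau label requirement: E_visible / E_jet > 1 - 2*sigma."""
    return visible_tau_energy / jet_energy > 1.0 - 2.0 * sigma

def simulate_b_tag(label, efficiency, fake_rate, rng=random):
    """Randomly decide whether a jet is tagged as a b-jet.

    True b-jets are tagged with probability `efficiency`; all other
    labels are tagged with probability `fake_rate`.
    """
    prob = efficiency if label == "b" else fake_rate
    return rng.random() < prob
```

The same pattern would apply to tau tagging, with the tau label and the corresponding efficiency and fake rate.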



7.2.9 Missing $E_T$

The missing transverse energy is calculated from all re- 
constructed objects: isolated electrons, photons, muons, 
taus, jets and non-isolated muons, and remaining calo- 
rimeter clusters not associated to jets. In addition, cells 
not associated to clusters are included in the missing $E_T$
calculation. The cell energies are smeared by applying the 
jet resolution functions. 
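
The calculation amounts to a negative vector sum of transverse energies. A minimal sketch, assuming each reconstructed object or unclustered cell is reduced to a hypothetical (transverse energy, $\phi$) pair:

```python
import math

def missing_et(objects):
    """Return (missing E_T, phi of the missing E_T vector).

    objects: iterable of (et, phi) for every reconstructed object and
    every cell that is not part of a cluster.
    """
    sum_x = sum(et * math.cos(phi) for et, phi in objects)
    sum_y = sum(et * math.sin(phi) for et, phi in objects)
    mex, mey = -sum_x, -sum_y          # missing E_T balances the visible sum
    return math.hypot(mex, mey), math.atan2(mey, mex)
```

A perfectly balanced event gives a missing $E_T$ of zero up to floating-point rounding.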

7.3 ATLFAST-II

ATLFAST-II directly simulates the input to the standard 
Athena reconstruction algorithms to mimic the full simu- 
lation. Unlike ATLFAST-I, which provides only momenta 
for the reconstructed objects, reconstructed ATLFAST- 
II output includes all the properties associated with a 
reconstructed object. In the case of Fatras these include 
the hits in the inner detector and muon system, and for 
FastCaloSim these include the energies in the calorime- 
ter cells. Because the standard reconstruction is run, it is 
possible to work with a combination of full and ATLFAST- 
II simulated events without modifying any analysis code. 
Both Fatras and FastCaloSim run together with the event 
reconstruction. The simulation time is reduced by making 
use of the simplified detector description used for
reconstruction [114]. By default, ATLFAST-II uses full
simulation for the inner detector and muon system and
FastCaloSim in the calorimetry. ATLFAST-IIF uses FastCaloSim
in the calorimetry and Fatras in the inner detector and 
muon system. 

Fatras takes input in HepMC format, performs a smearing
of the primary vertex position to represent the luminous
region within ATLAS, and records
truth information in a way similar to the full simulation. 
FastCaloSim uses the truth information of all interacting 
particles at the end of the inner detector volume as input 
to the calorimeter simulation. In order to simulate pile- 
up, generated events must be overlaid prior to detector 
simulation. 



7.3.1 Fatras 

ATLFAST-II with the fast track simulation engine Fa- 
tras (ATLFAST-IIF) reduces simulation time in the in- 
ner detector and muon system. Fatras is an ATLAS spe- 
cific development and establishes a complete simulation 
within the track reconstruction framework. The recon- 
struction geometry is a simplified description of the full 
detector geometry, which keeps the same descriptive accu- 
racy for sensitive detector parts, while approximating all 
other detector components as simplified layers that carry a 
high-granularity density map. This detector material de- 
scription can be sufficient. A factor of 100 reduction in 
CPU time is obtained with only small physics performance 
degradation. The propagation of the particles through the 
tracking detectors is carried out by the extrapolation
engine [115] used in the offline track reconstruction
applications.



The interactions of the particles with the simplified
detector layers are simulated using several methods.
Multiple Coulomb scattering is implemented as a Gaussian
mixture model to account for tail effects from single
large-angle scattering processes; ionization and radiative
energy loss are simulated according to the Bethe-Bloch and
Bethe-Heitler models; conversion of a photon into an
electron and positron is carried out depending on the
thickness of the traversed material; hadronic interactions of
particles with the detector layers are simulated from a
parametric model that has been obtained from Geant4
simulation results. The decay of unstable particles is
enhanced by a dedicated wrapper of the associated Geant4
modules [111]. The calorimeter simulation of ATLFAST-IIF
is typically FastCaloSim, and Fatras provides the input
particle collection. Energy deposition for muons in
the calorimeter layers is also recorded according to the 
material description of the reconstruction geometry and 
is further used for cluster simulation in the FastCaloSim 
application. 
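
As an illustration of the Gaussian mixture idea for multiple Coulomb scattering, the sketch below draws a projected scattering angle from a narrow core plus a wider tail component. It is not the Fatras implementation: using the standard Highland formula for the core width is an assumption, and the tail fraction and tail width scale are made-up illustrative values.

```python
import math
import random

def highland_theta0(p_mev, beta, x_over_x0, charge=1.0):
    """Core width of the projected scattering angle (Highland formula)."""
    return (13.6 / (beta * p_mev)) * abs(charge) * math.sqrt(x_over_x0) * (
        1.0 + 0.038 * math.log(x_over_x0))

def sample_scattering_angle(p_mev, beta, x_over_x0, rng=random,
                            tail_fraction=0.02, tail_scale=3.0):
    """Draw a projected angle from a two-component Gaussian mixture.

    tail_fraction and tail_scale are illustrative, not Fatras settings.
    """
    theta0 = highland_theta0(p_mev, beta, x_over_x0)
    # Occasionally use a wider Gaussian to mimic single large-angle scatters.
    sigma = theta0 * tail_scale if rng.random() < tail_fraction else theta0
    return rng.gauss(0.0, sigma)
```

The wider component populates the tails of the angular distribution that a single Gaussian would miss.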

Fatras was first established as a validation tool for the 
newly deployed inner detector reconstruction sequence. It 
has already been used for noise studies in the Transition 
Radiation Tracker and first simulations for a potential fu- 
ture upgrade of the ATLAS inner detector. The validation 
of Fatras against the full simulation results to be used for 
first collisions data from LHC is ongoing. An extension of 
the fast track simulation within the reconstruction geome- 
try has taken place that also allows the use of Fatras in the 
muon spectrometer. The particles being simulated at the 
end of the inner detector volume are filtered. Muons are 
transported through the calorimeter, and their deposited 
energy is stored as an input to the FastCaloSim module. 
The trajectories of the muons are then simulated in the 
muon spectrometer, and the hits within sensitive detector 
elements are recorded. Standard digitization is applied on 
top of the simulated hits to account for the detailed cali- 
bration that must be included for a comparison to data. 



7.3.2 FastCaloSim 

Instead of simulating the particle interactions with the 
detector material, the energy of single particle showers is 
deposited by FastCaloSim directly using parametrizations 
of the longitudinal and lateral energy profile. The distri- 
bution of active and inactive material in the calorimeter 
needs to be respected by the parametrization, so a fine 
binning of the parametrization in the particle energy and 
pseudorapidity is needed. Furthermore, the energy depo- 
sition depends strongly on the origin of the shower in the 
calorimeter, so all parametrizations are also binned versus 
the longitudinal depth of the shower center. 

The parametrizations are based on a 30 million event 
sample of fully-simulated (i.e. simulated with Geant4) 
single photons and charged pions in an energy range
between 200 MeV and 500 GeV, evenly distributed in
$|\eta| < 5.0$ and $-\pi < \phi < \pi$. All electron and photon showers
are approximated by the photon parametrization and all 
hadronic showers are approximated by the charged pion 
parametrization. The simplified reconstruction geometry 
of the calorimeter is used with details at the level of the 
readout cells. 

The parameterization of the longitudinal energy dis- 
tribution is constructed from histograms of the total en- 
ergy in all calorimeter layers, the longitudinal depth of 
the shower center, and the energy fraction in each layer 
for the fully-simulated single-particle events. The dom- 
inant correlations between fractional energy deposits in 
each calorimeter layer (i.e. those related to the longitudinal
depth of the shower's origin) are accounted for in
the parameterization binning. Gaussian correlations be- 
tween fractional energy deposits in each calorimeter layer 
(i.e. those describing shower development) are stored in 
a correlation matrix and are applied to improve the pa- 
rameterized energy distribution. During fast simulation, 
the parametrization closest in energy and pseudorapidity 
to the particle is taken, and then the total shower en- 
ergy and the shower depth are chosen randomly from the 
stored histograms and rescaled to match the true parti- 
cle energy. It was found that after rescaling no interpo- 
lation between parametrizations is necessary. Afterwards, 
the energy fractions in all calorimeter layers are gener- 
ated randomly, taking into account the correlation ma- 
trix. The lateral energy distribution inside each calorime- 
ter layer is simulated using a symmetric average radial 
shape function. The shape functions are extracted from 
fits to fully-simulated single-particle events and are con- 
structed for bins of particle type, primary particle energy, 
position in $\eta$, and shower depth in the calorimeter. The
asymmetry of shower shapes for particles entering the ca- 
lorimeter at large incident angles is absorbed in a shape 
function describing a pseudorapidity-dependent asymme- 
try term. During simulation, the energy of a calorimeter 
cell is determined by the integral of the shape function 
over the cell surface area. Fluctuations derived from the 
intrinsic resolutions of each calorimeter are applied to the 
cell energy. The total energy of all cells in one calorimeter 
layer is normalized to the total energy in the layer making 
use of the longitudinal shower shape. 
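
The last two steps, fluctuating the cell energies and normalizing them to the layer energy, can be sketched as follows. This is a hedged illustration: the function name is hypothetical, the cell weights stand in for the integrals of the lateral shape function, and a multiplicative Gaussian fluctuation with a single relative resolution is an assumption made for brevity.

```python
import random

def distribute_layer_energy(cell_weights, layer_energy, rel_resolution, rng=random):
    """Share `layer_energy` among cells.

    cell_weights: dict mapping a cell id to the integral of the lateral
    shape function over that cell's surface area.
    """
    # Apply a resolution-like fluctuation to each cell (never negative).
    fluctuated = {
        cell: max(0.0, w * rng.gauss(1.0, rel_resolution))
        for cell, w in cell_weights.items()
    }
    total = sum(fluctuated.values())
    if total <= 0.0:
        return {cell: 0.0 for cell in cell_weights}
    # Normalize so the cells sum to the layer energy from the longitudinal shape.
    return {cell: layer_energy * w / total for cell, w in fluctuated.items()}
```

The normalization guarantees that the fluctuations redistribute energy between cells without changing the total assigned to the layer.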

The histograms and shape functions needed as input 
for the parametrizations use about 200 MB of memory.
Since no simulation of particle interactions is done, the
dominant part of the simulation time is spent on the
numerical integration of the lateral shape functions. Overall,
the calorimeter simulation time for a single particle is a
few microseconds, and a typical (e.g. $t\bar{t}$) event needs a few
seconds. 

The parameterization of FastCaloSim differs in several 
important ways from that of the Fast G4 Simulation. Fast- 
CaloSim fills the readout geometry of ATLAS and applies 
a parameterization from the edge of the inner detector, 
whereas the Fast G4 Simulation places hits like those of 
Geant4 into the full ATLAS detector geometry and is 
only applied in the sampling portion of the calorimeter 
(e.g. excluding the cryostats surrounding the calorimetry). 
As a result, the Fast G4 Simulation output can be run 
through the standard digitization software, whereas the 
FastCaloSim output is fed directly into the reconstruction. 



7.4 Computing Performance 

Examples of simulation times in kSI2K seconds [116] for
various types of events in the full and fast simulations are
provided in Tables 15 and 16. In single central ($|\eta| < 3$) electron
events the simulation time is decreased by a factor of ten
or more by the fast G4 simulation, and in hard scattering 
events the simulation time is decreased by a factor of 2-5. 
ATLFAST-II without Fatras decreases simulation time by 
a factor of 20-40, and ATLFAST-IIF decreases simulation 
time by a factor of 100. FastCaloSim accounts for about 
10% of the total simulation time in ATLFAST-II and 60- 
70% of the total simulation time in ATLFAST-IIF. ATL- 
FAST-I requires a relatively negligible amount of CPU 
time even for hard scattering events. ATLFAST-I, Fast- 
CaloSim, and Fatras run during the reconstruction step, 
but for these purposes the time consumed by their methods
is included in "simulation time." Figure 8 shows the
distribution of simulation times per event for full, Fast G4,
and ATLFAST-II simulation of 250 $t\bar{t}$ events. Here, the
average time required to run FastCaloSim on these events
has been added to the full inner detector and muon simu- 
lation time of each event. The distributions are similar in 
shape. 
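
The speed-up factors can be checked directly against the tabulated times. The short calculation below uses the $t\bar{t}$ row of Table 16; the dictionary layout and function name are illustrative choices.

```python
# Simulation times per t-tbar event in kSI2K seconds, copied from Table 16.
TTBAR_TIMES = {
    "Full Sim": 1990.0,
    "Fast G4 Sim": 757.0,
    "ATLFAST-II": 101.0,
    "ATLFAST-IIF": 7.41,
    "ATLFAST-I": 0.097,
}

def speedups(times, reference="Full Sim"):
    """Ratio of the reference time to each other simulation's time."""
    ref = times[reference]
    return {name: ref / t for name, t in times.items() if name != reference}

for name, factor in speedups(TTBAR_TIMES).items():
    print(f"{name}: {factor:.1f}x faster than full simulation")
```

Repeating the calculation for the other rows of the table shows how the factor varies with event type.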

In evaluating these CPU times, it is necessary to keep 
in mind the additional steps required before analysis of 
the data can be performed. For both full and fast G4 sim- 
ulation, the data must be digitized and reconstructed. For 
ATLFAST-II, the inner detector and muon system must 
be digitized$^{16}$ and reconstructed, but the calorimeter
requires only reconstruction. For ATLFAST-IIF, only the
muon system must be digitized before reconstruction is 
performed. The output of ATLFAST-I is in a format sim- 
ilar to that of the reconstruction and needs no further 
processing. The CPU time required for these additional
steps is given in Table 18.



7.5 Physics Performance 

The fast simulations have been compared to full simula- 
tion in both low-level analyses with single particles enter- 
ing the calorimeter and high-level analyses of detector ob- 
servables with jets and active hard scattering events. The 
Fast G4 Simulation agrees to about 1-2% in jet energy 
scale after the standard calibration procedure and agrees 
to within 5% in electron identification efficiencies.
Due to the simplifications in the calorimeter simulation, 
FastCaloSim differs at the 5% level from full simulation 
after reconstruction, especially in properties that are sen- 
sitive to the shape of hadronic showers. The jet energy 
scale differs by 1-2% after recalibration, and electron iden- 
tification efficiency differs by about 5%. Since all parti- 
cles are simulated using an average lateral shape function, 
visible effects like electromagnetic subshowers in charged 
pion showers are not described. These differences can be 
reduced by applying additional object-dependent correction
functions after reconstruction. Fakes and calorimeter
punch-through are not well modeled in ATLFAST-II and
ATLFAST-IIF.

15 Measurements were performed on Sun Fire X2200 M2 units with dual dual-core 2.6 GHz AMD Opteron 2218 processors. Normalization was done using the peak specmark int 2000 rating 1794. For the same system, the peak specmark floating point 2000 rating was 3338. The normalization follows the published results, rather than the WLCG formula in [117]. Details of cross-platform benchmarking can be found in [118].
16 The inner detector and muon system together require about 2/3 of the total digitization time.

Table 15. Simulation times per event, in kSI2K seconds, for single particles generated with $|\eta| < 3.0$ and with the same transverse momentum. All times are averaged over 500 events. ATLFAST-II uses full simulation for the inner detector and muon system and FastCaloSim in the calorimetry. ATLFAST-IIF uses FastCaloSim in the calorimetry and Fatras in the inner detector and muon system.

Sample        Full Sim   Fast G4 Sim   ATLFAST-II   ATLFAST-IIF   ATLFAST-I
5 GeV μ±      0.879      0.899         1.28         0.633         0.011
50 GeV μ±     1.63       1.15          2.71         0.606         0.011
500 GeV μ±    12.0       10.4          11.8         0.615         0.011
1 GeV e±      3.62       0.734         0.825        0.513         0.011
5 GeV e±      17.8       1.64          1.00         0.542         0.011
50 GeV e±     179.       4.86          1.25         0.588         0.013
1 GeV π±      2.40       1.48          0.701        0.515         0.011
5 GeV π±      10.4       4.27          0.811        0.540         0.011
50 GeV π±     94.7       30.3          1.04         0.569         0.011

Table 16. Simulation times per event, in kSI2K seconds, for the full simulation, Fast G4 simulation, ATLFAST-II, ATLFAST-IIF, and ATLFAST-I. ATLFAST-II uses full simulation for the inner detector and muon system and FastCaloSim in the calorimetry. ATLFAST-IIF uses FastCaloSim in the calorimetry and Fatras in the inner detector and muon system. All times are averaged over 250 events, except heavy ion times which were averaged over only 50 events. Because the memory required to reconstruct heavy ion events exceeds 3 GB and because FastCaloSim runs during the reconstruction step, the amount of time taken by FastCaloSim could not be measured in that sample. It was estimated as 10% of the full inner detector simulation time, consistent with the other hard scattering events.

Sample             Full Sim   Fast G4 Sim   ATLFAST-II   ATLFAST-IIF   ATLFAST-I
Minimum Bias       551.       246.          31.2         2.13          0.029
$t\bar{t}$         1990       757.          101.         7.41          0.097
Jets               2640       832.          93.6         7.68          0.084
Photon and jets    2850       639.          71.4         5.67          0.063
W± → e±ν_e         1150       447.          57.0         4.09          0.050
W± → μ±ν_μ         1030       438.          55.1         4.13          0.047
Heavy ion          56,000     21,700        ~3050        203           5.56

Figure 9 shows missing transverse energy along the x-axis
for the full and fast simulations in di-jet events with
a leading parton $p_T$ between 560 and 1120 GeV, as well
as jet $p_T$ resolution as a function of $\eta$ in $t\bar{t}$ events for jets
with $20 < p_T^{true} < 40$ GeV. ATLFAST-II and the Fast
G4 Simulation agree well with full simulation in missing
transverse energy spectrum, even in the tails of the
distribution. ATLFAST-I does not sufficiently populate the
tails of the missing transverse energy distribution, and
ATLFAST-IIF has too wide a distribution. ATLFAST-I,
ATLFAST-IIF, and ATLFAST-II show 10-20% deviations
from full simulation in jet transverse momentum resolution.
Fast G4 simulation is consistent with full simulation
through the entire range in pseudorapidity. Figure 10
shows reconstructed muon $p_T$ resolution as a function of
muon $p_T$ in $Z \to \mu^+\mu^-$ events. Muons reconstructed using
the muon spectrometer alone ("standalone") and those
reconstructed using both the muon spectrometer and inner
detector ("combined") are shown. Only one type of muon
is provided by ATLFAST-I, so it is only included in the
combined reconstruction plot. In the cases of ATLFAST-II
and the Fast G4 simulation, muon spectrometer simulation
is done by Geant4 and should, therefore, be identical
to full simulation. The fast simulations show generally
good agreement over the entire range of $p_T$. ATLFAST-IIF
has standalone muon resolution that is 10% better than
full simulation in some bins of $p_T$, but since the muon
system simulation of ATLFAST-IIF is still under development,
the agreement is expected to improve. It is generally
left to the physics groups to evaluate the fast simulations
with their analyses and determine which is acceptable.



8 Validation 

Validation of the ATLAS simulation chain is done in two 
distinct phases. First, the software performance must be 
assessed. Then, the physics performance must be tested 
and compared to available data. The first step includes 
testing robustness, testing software performance, and test- 
ing basic functionality. The second step includes compar- 
ison to test beam, cosmic data, and physics results ob- 
tained from previous simulation productions. In this sec- 
tion the infrastructure for each stage of validation is de- 
scribed. 
Fig. 8. Distributions of CPU time for 250 $t\bar{t}$ events in full, Fast G4, and ATLFAST-II simulations. Vertical dotted lines denote
the averages of the distributions.

Fig. 9. Left, fast simulations (color) and full simulation (black) comparison of missing transverse energy along the x-axis in
di-jet events with a leading parton $p_T$ between 560 and 1120 GeV. Right, a comparison of jet $p_T$ resolution as a function of
pseudorapidity in $t\bar{t}$ events for jets with $20 < p_T^{\mathrm{true}} < 40$ GeV.



ATLAS has a fresh software build every night for coor- 
dinating software development. Each nightly build is run 
through a rigorous test cycle, and as a release deadline 
approaches the test results are increasingly scrutinized to 
evaluate stability and performance. Thanks to the evalu- 
ation prior to release, generally only rare bugs appear in 
production for the first time. The automatic testing infras- 
tructure also allows evaluation of many different versions 
of the Athena software. Separate bug-fixing and develop- 
ment branches are employed, for example, and significant 



interface changes or low-level code migrations take place
in separate branches until they are sufficiently stable to be
merged into the main branch. Each version of the software
comes in several flavors for different system architectures,
operating systems, compilers, and so on. The simulation
production in 2008 used 32-bit builds with gcc 3.4.6 [119]
on CERN's Scientific Linux 4 [120]. External dependencies
include CLHEP 1.9.3.1 and WLCG 54G.

A web portal has been constructed, using the Savannah
bug tracking software [121], for monitoring problems with
Fig. 10. Fast simulations (color) and full simulation (black) comparison of reconstructed muon $p_T$ resolution as a function of
muon $p_T$ for central ($|\eta| < 1.2$) muons in $Z \to \mu^+\mu^-$ events for muons reconstructed using only the muon spectrometer (left)
and using both the inner detector and spectrometer (right). ATLFAST-I only provides one type of muon, which is included in
the right plot.



the various facets of the simulation software. Bugs can be 
reported and tracked as they are diagnosed and solutions 
are found, and new features can be requested. 



8.1 Automated Testing 

The software performance of the simulation is monitored 
in three types of automated tests: ATLAS Testing Nightly 
(ATN) tests, Run Time Tester (RTT) tests, and Full Chain 
Tests (FCT). ATN tests are run every night on every soft- 
ware build and are basic functionality tests. RTT tests are 
run on a subset of builds and include 50 simulation tests 
to ensure functionality and, in some cases, consistent re- 
sults. FCT tests are run on only a few builds each night 
and test the entire software chain in a production-like envi- 
ronment. Releases are required to pass a minimal number 
of milestones before being declared ready for production. 
For details about the ATN and RTT, refer to [9].

FCT tests are run daily on a small set of jobs. The aim 
of the FCT is to verify the readiness of a production cache 
release candidate for Grid production. The FCT runs jobs 
that test the functionalities of generation, the different fla- 
vors of fast and full simulation, digitization, bytestream 
conversion, and reconstruction of Standard Model pro- 
cesses, black hole production, and heavy ion collisions. In 
the case of Standard Model processes, a full chain¹⁷ of jobs
is run per release, with hard scattering events that stress
the software. If successful, the output from each day's run 



17 A single chain of jobs runs all steps from event generation 
through reconstruction, sequentially, using the output of one 
step as the input of the next. 



is saved for use in the next step of the test the following 
day (e.g. Monday's generation provides input for Tues- 
day's simulation, which provides input for Wednesday's 
digitization). The typical number of events processed (50)
is limited by the CPU requirements for the full simulation. 
As part of the FCT, 1000 events that were simulated 
with an old validated release are reconstructed. This long 
test allows better evaluation of the reconstruction's stabil- 
ity. Moreover, the relatively large sample is used to make 
a preliminary check on the quality of the reconstruction 
for final state objects (jets, electrons, muons, etc.). All the 
other tests only check for the success or failure of the job, 
the number of events in the output file, and unknown er- 
ror messages in the log file. If any of these checks fail, the 
release candidate is rejected, and an additional iteration 
of bug fixing is undertaken. Only once a release succeeds 
in all FCT tests is it distributed to the Grid. 
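
The pass/fail logic described above (job success, number of events in the output file, and unknown error messages in the log file) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the job-description dictionary, and the pattern used to recognize tolerated messages are hypothetical and do not reproduce the actual FCT implementation.

```python
import re

# Hypothetical pattern for error messages that are known and tolerated.
KNOWN_ERROR_PATTERNS = [re.compile(r"ERROR .*known issue")]


def check_job(exit_code, events_written, events_expected, log_lines,
              known_patterns=KNOWN_ERROR_PATTERNS):
    """Return a list of failure reasons; an empty list means the job passed."""
    failures = []
    if exit_code != 0:
        failures.append(f"job failed with exit code {exit_code}")
    if events_written != events_expected:
        failures.append(
            f"expected {events_expected} events, found {events_written}")
    for line in log_lines:
        # Any ERROR line not matching a known pattern counts as unknown.
        if "ERROR" in line and not any(p.search(line) for p in known_patterns):
            failures.append(f"unknown error message: {line.strip()}")
    return failures


def release_accepted(job_results):
    """A release candidate passes only if every job passes every check."""
    return all(not check_job(**job) for job in job_results)


if __name__ == "__main__":
    job = dict(exit_code=0, events_written=49, events_expected=50,
               log_lines=["INFO start", "ERROR unexpected crash"])
    for reason in check_job(**job):
        print(reason)
```

A single failing check in any job is enough to reject the candidate, which mirrors the requirement that a release succeed in all FCT tests before distribution.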



8.2 Computing Performance Benchmarking 

Event generation jobs are typically fast enough that little
effort is made to test their software performance. In general,
the jobs take tens of milliseconds per event. Generation with
Pythia or Herwig requires about 450 MB of memory, and generation
with Hijing requires about 170 MB of memory. The files produced
in generation jobs are tens of kB per event (e.g. 40 kB/event for
$t\bar{t}$ events).

CPU time, memory consumption, and output file size for the
simulation are tested in each stable release using a variety of
physics processes. Single muons, electrons, and charged pions are
used, as well as di-jets in bins of leading parton $p_T$, a
supersymmetric benchmark point¹⁸, minimum bias, Higgs boson
decaying to four leptons, $Z \to e^+e^-$, $Z \to \mu^+\mu^-$, and
$Z \to \tau^+\tau^-$ events. The same input events are used every
time to ensure fair comparison of simulation results independent
of changes to the event generation.

Simulation of the full detector typically requires ~750 MB of
memory (i.e. VSIZE) and includes loading of almost 400 libraries
into memory. Memory consumed during simulation is also broken down
into its three key components by taking snapshots of memory use
during initialization: GeoModel, the ATLAS-side detector geometry,
typically ~100 MB; G4Atlas, the purely Geant4 component of the
memory, typically ~300 MB; and load modules, the remaining
algorithms and services loaded during the job, typically ~300 MB.
Significant changes in any of these can indicate the source of a
change in memory. The memory requirement is independent of the
number of events in the job and varies by only a few percent for
different physics processes. Although up to 2 GB of memory may be
reserved for a Grid job, by keeping the memory requirements of a
typical simulation job under 1 GB, more machines can be used. The
memory required to build each piece of the detector can be found in
Table HI. Of major concern is any increase in memory ("leaks")
during the event loop once all libraries have been loaded and setup
is complete. Some increase due to caching is expected during the
processing of the first few events. However, if the memory required
by the application continues to grow beyond the system limits,
memory corruption and memory pressure can result in serious
problems. The memory required by the ATLAS simulation has been
found to increase by less than 0.25 MB per event under normal
circumstances. The increases are not steady, but come in large
(~10 MB) and sporadic jumps. The source of these increases is not
fully understood, but a 50 event simulation job still consumes well
under 1 GB of memory.
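
The memory figures quoted above can be combined in a short worked example. The sketch below assumes that the memory footprint is sampled once per event; the function names and the warm-up length of five events are illustrative choices, not part of the ATLAS benchmarking tools. Because the observed growth comes in sporadic jumps, the end-to-end average used here is only a coarse indicator.

```python
def growth_per_event(vsize_mb, warmup_events=5):
    """Average memory growth in MB/event after a warm-up period.

    vsize_mb[i] is the memory footprint measured after event i. The first
    `warmup_events` samples are skipped because some growth from caching
    is expected during the first few events.
    """
    steady = vsize_mb[warmup_events:]
    if len(steady) < 2:
        raise ValueError("need at least two samples after warm-up")
    return (steady[-1] - steady[0]) / (len(steady) - 1)


def projected_footprint(baseline_mb, leak_mb_per_event, n_events):
    """Footprint at the end of a job, assuming constant growth per event."""
    return baseline_mb + leak_mb_per_event * n_events


if __name__ == "__main__":
    # Numbers quoted in the text: ~750 MB baseline, < 0.25 MB/event, 50 events.
    print(projected_footprint(750.0, 0.25, 50))  # 762.5
```

With the upper bound of 0.25 MB per event, a 50 event job grows by at most 12.5 MB, so the projected footprint of about 762.5 MB stays well under the 1 GB target.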

The CPU time consumed by generation, full simulation,
digitization, and reconstruction for various types of events is
shown in Tables 17 and 18 and is typically several minutes per hard
scattering event. All times are normalized to kSI2K seconds [116].
For the purposes of testing, logfile output was suppressed and no
output files (e.g. hit or RDO files) were created. CPU time is also
measured as a function of other simulation input parameters prior
to significant changes, for example using different physics lists.
For these runs, output files were disabled; in simulating
$t\bar{t}$ events the time per event is increased by ~0.5% when file
writing is enabled. The hard scattering events shown in Table 18
were generated with a 14 TeV center of mass energy; for 10 TeV
center of mass energy the simulation time is reduced by 17% for
$t\bar{t}$ events. The distribution of CPU time for simulation of
250 $t\bar{t}$ events is shown in Figure 8.
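
To put these numbers in context, the following sketch works through the arithmetic for a 50 event full simulation job of $t\bar{t}$ events, using the per-event time from Table 18 and the fractional changes quoted in the text. The helper names are illustrative assumptions.

```python
FULL_SIM_TTBAR = 1990.0  # kSI2K seconds per event at 14 TeV, from Table 18


def job_cpu_hours(seconds_per_event, n_events):
    """Total CPU cost of a job in kSI2K hours."""
    return seconds_per_event * n_events / 3600.0


def scaled_time(seconds_per_event, fractional_change):
    """Apply a fractional change, e.g. -0.17 for a 17% reduction."""
    return seconds_per_event * (1.0 + fractional_change)


if __name__ == "__main__":
    print(round(job_cpu_hours(FULL_SIM_TTBAR, 50), 1))  # 27.6
    print(round(scaled_time(FULL_SIM_TTBAR, -0.17)))    # 1652
    print(round(scaled_time(FULL_SIM_TTBAR, 0.005)))    # 2000
```

A 50 event job therefore costs roughly 27.6 kSI2K hours; running at 10 TeV would reduce the per-event time to about 1650 kSI2K seconds, while enabling file writing would raise it to about 2000 kSI2K seconds.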

For the samples in Tables 18 and 16, event generation of
W production, minimum bias interactions, di-jet events,
and photon and jet events was done using Pythia. $t\bar{t}$
production was done using MC@NLO for the hard scattering
and Herwig for hadronization and showering. Heavy ion
event production was done using Hijing.

Digitization jobs are generally fast, but memory consumption
can be a serious concern during jobs with many overlaid events.
Table 19 shows how resource consumption during digitization of
50 $t\bar{t}$ events scales with pile-up luminosity¹⁹. The memory
required for digitizing with a luminosity of
$10^{34}$ cm$^{-2}$ s$^{-1}$ is sufficiently large that the memory
limit of the testing machine was reached, and, therefore, swapping
resulted in a significant increase in CPU time. The allocated
memory per event is provided as a benchmark of the change in memory
over the course of a single event.



8.3 Physics Validation 

Once a new release is distributed to the Grid sites, a set of 
several physics samples is produced. Typically, a "valida- 
tion sample" includes 10,000 events for each process, a to- 
tal of 110,000 single particle events and 250,000 hard scat- 
tering events. This standard validation sample includes 
single muons, pions, and electrons, Standard Model processes
($t\bar{t}$ production, vector boson production, B-physics), and
exotic processes (e.g. supersymmetric events and black hole
production). The composition of the validation
sample has been chosen to test all aspects of the event 
reconstruction. 

The running of the validation sample on the Grid usually
exposes rare software problems in the release. It is unlikely
that software bugs that appear with a frequency much lower than
1/1000 events are caught by the automatic validation procedure
(1000 events is the size of the "long" jobs of the FCT, described
in Section 8.1). This first round of production provides a
feedback mechanism for the developers, who produce bug fixes
before the next production cycle.
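
The statement about bug frequency can be made quantitative with a simple estimate. The sketch below assumes that a bug affects each event independently with a fixed probability, which is an idealization: a real bug may only be triggered by one physics process, in which case only the relevant subsample counts. The function name is illustrative.

```python
def detection_probability(failure_rate, n_events):
    """Probability of at least one failure in n_events independent events."""
    return 1.0 - (1.0 - failure_rate) ** n_events


if __name__ == "__main__":
    # 1000-event "long" FCT job versus the full validation sample
    # (110,000 single particle + 250,000 hard scattering events).
    for rate in (1e-3, 1e-4, 1e-5):
        p_fct = detection_probability(rate, 1000)
        p_val = detection_probability(rate, 360_000)
        print(f"rate {rate:.0e}: FCT {p_fct:.2f}, validation sample {p_val:.2f}")
```

Under this assumption, a bug occurring once per 1000 events is seen in a 1000 event job about 63% of the time, while a bug occurring once per 100,000 events is seen only about 1% of the time there but about 97% of the time in the full validation sample.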

The last step before using a release for production is physics
validation. A dedicated group of experts, including
representatives from every detector performance group (e.g.
tracking, b-tagging, and jet reconstruction) and physics group
(e.g. Standard Model, supersymmetry, and exotics searches) in
ATLAS, runs physics analyses on the validation samples. Their task
is to verify the quality of the single object reconstruction (e.g.
jets, electrons, and muons) and the results of more complex physics
analyses (e.g. mass reconstruction in $Z \to \mu^+\mu^-$,
$Z \to e^+e^-$, and $t\bar{t}$ events). The relatively large
validation samples may expose minor problems that could not be
found with lower statistics, for example a shift of a few percent
in the reconstructed energy. In order to properly validate
each version of the software, the results from each release



18 ATLAS mSUGRA benchmark point SU3: $m_0 = 100$ GeV,
$m_{1/2} = 300$ GeV, $A_0 = -300$, $\tan\beta = 6$, and $\mu > 0$.



19 The ATLAS software is under a continuous process of im- 
provement, with improving performance in terms of calculation 
speed and memory profile, and problems such as memory leaks 
being identified and eliminated. 
Table 17. Simulation times per event, in kSI2K seconds, for single particles generated with $|\eta| < 3.0$. All times are averaged
over 500 events. For timing, logfile output and output files were suppressed.



Particle 



p T = 1 GeV p T = 5 GeV p T = 50 GeV p T = 500 GeV 



Single electrons (e)
Single muons ($\mu$)
Single charged pions ($\pi$)



3.62 



2.40 



17.8 


179. 


0.879 


1.63 


10.4 


94.7 



12.0 



Table 18. Generation, simulation, digitization, and reconstruction times per event, in kSI2K seconds. Generation times are
averaged over 5000 events, except generation of heavy ion events, which were averaged over 250 events. The generation time
for $t\bar{t}$ events includes only the hadronization time, not the time consumed by MC@NLO generation. Simulation, digitization,
and reconstruction times are averaged over 250 events, except simulation of heavy ion events, which were averaged over 50
events. The heavy ion event simulation time is for events with a random impact parameter. Central collisions require on average
3.4 times longer to simulate. Reconstruction time can vary dramatically depending on the algorithms run and the trigger
configuration. These times should be taken as indicative of the order of magnitude, rather than as a precise measurement.
During reconstruction of heavy ion events, the testing machines ran out of memory. Based on a previous release, heavy ion
collision reconstruction is estimated to take ~10 times longer than $t\bar{t}$ event reconstruction. For timing, logfile output and output
files were suppressed.



Sample                        Generation   Simulation   Digitization   Reconstruction
Minimum Bias                  0.0267       551.         19.6           8.06
$t\bar{t}$ Production         0.226        1990         29.1           47.4
Jets                          0.0457       2640         29.2           78.4
Photon and jets               0.0431       2850         25.3           44.7
$W^\pm \to e^\pm \nu_e$       0.0788       1150         23.5           8.07
$W^\pm \to \mu^\pm \nu_\mu$   0.0768       1030         23.1           13.6
Heavy ion                     2.08         56,000       267            -



Table 19. Digitization computing resources for 50 $t\bar{t}$ events as they scale with luminosity. CPU times are normalized to the
time required by a no pile-up job. Cavern background was overlaid during these jobs with a safety factor of one. Beam gas and
beam halo were ignored.



Resource                       No Pile-Up
CPU Time Factor                1.0          2.3     5.8     160
Memory Leak [kB/event]         10           270     800     2100
Virtual Memory [MB]            770          1000    1300    2000
Allocated Memory [MB/event]    12           21      40      985



are typically compared to those of previous validated re- 
leases. The software must, therefore, maintain backwards- 
compatibility in order to allow fair comparisons. Shifts in 
file format are carefully coordinated, and maintenance of 
the old format is continued for as long as necessary to en- 
sure result consistency. The physics validation procedure 
is also used for checking major changes in the fast and full 
simulation (detector description, change in the simulation 
parameters, etc.). 

The Geant4 simulation has also been validated in a physics
sense with all available detector data. Combined test beam
studies have proven invaluable in understanding the performance
of each of the subsystems, and the standalone test beam analyses
have provided crucial input towards the optimization of the
simulation and choice of parameters [122-125]. In 2008, a
significant sample of cosmic ray data was collected with multiple
subdetectors. The data have provided an important test of the
simulation [126].



Although the detector simulation relies heavily on Geant4, a
significant effort was put into comparing tile calorimeter test
stand response with the Fluka simulation toolkit [127]. For this
comparison, the test stand geometry was translated into the Fluka
geometry format, and the output from the Fluka simulation was
translated back into a format comparable to that of Geant4. It was
eventually concluded that little would be gained by attempting a
transition to Fluka that could not already be gained by
modifications to parameters and a different choice of physics
models within Geant4. Fluka has also been used to study neutron
flux and radiation levels throughout the detector [103], but many
of these studies are being updated in Geant4 with the
high-precision neutron physics list (see Section 5.4).
Extensive efforts are underway to compare simulated
data to real data and validate the output from each de- 
tector. Thanks to the multiple detector descriptions, sev- 
eral analyses have already been prepared and tested to 
find discrepancies between the detector description of the 
simulation and that of the as-built detector. For example, 
subdetectors can be "weighed" in the simulation to ensure 
that the amount of material is within a few percent of the 
constructed detector. Although the agreement with first 
high-energy collision data is not expected to be perfect, 
a great deal of experience has been gained. The effects of 
modifications to Geant4 parameters have also been stud- 
ied in some detail, so that differences between data and 
simulation might be remedied rapidly. 
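
The idea of "weighing" a subdetector in the simulation can be illustrated with a toy calculation: the mass is the sum of volume times density over all pieces, and the result is compared with the as-built mass. The sketch below is a minimal illustration only; the volumes, densities, function names, and the 3% tolerance are made-up assumptions and do not correspond to any ATLAS subdetector or tool.

```python
def weigh(volumes):
    """Total mass in kg from (volume in cm^3, density in g/cm^3) pairs."""
    return sum(v * rho for v, rho in volumes) / 1000.0


def within_tolerance(simulated_kg, as_built_kg, tolerance=0.03):
    """True if the simulated mass agrees with the as-built mass.

    The default tolerance of 3% is a hypothetical stand-in for
    "within a few percent".
    """
    return abs(simulated_kg - as_built_kg) <= tolerance * as_built_kg


if __name__ == "__main__":
    # Made-up volumes and densities, for illustration only.
    toy_subdetector = [(2.0e5, 7.87), (5.0e4, 2.70), (1.0e5, 1.03)]
    mass = weigh(toy_subdetector)
    print(f"simulated mass: {mass:.0f} kg,",
          "agrees" if within_tolerance(mass, 1800.0) else "disagrees")
```

A disagreement larger than the tolerance would point to missing or mis-described material in the simulated geometry.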

Digitization algorithms have been tuned against labo- 
ratory test results, test beam data, and, where possible, 
cosmic ray data taken during the detector commissioning. 
The studies continue with the data. 



9 Summary and Conclusions 

We have presented the status of the ATLAS simulation 
project, including all steps from event generation to dig- 
itization. A robust and flexible framework is required to 
cope with the demands of complex detector descriptions 
and physics models. The software project has been ready
for data since late 2008.

A variety of event generators are available to provide 
the user with a complete set of tools for testing new phys- 
ics models. The simulation is highly configurable to ensure 
maximal flexibility in the face of the uncertain challenges 
approaching. The detector description itself, conditions of 
the detector, and many parameters used in the simulation 
can be modified at run time. The digitization is also made 
configurable to cope with uncertainty in machine perfor- 
mance, detector conditions, and cavern conditions. Three 
varieties of fast simulation have been made available to 
ease the difficulties caused by the time consumption of 
the full detector simulation. They each complement the 
full simulation. 

Generation, simulation, and digitization tasks are run- 
ning continually on the Grid. The validation program has 
produced a high quality simulation sample for the ATLAS 
experiment data. 

We are greatly indebted to all CERN's departments and to the 
LHC project for their immense efforts not only in building the 
LHC, but also for their direct contributions to the construction 
and installation of the ATLAS detector and its infrastructure. 
We acknowledge equally warmly all our technical colleagues 